Obviously, in the vast majority of cases, there's only one device
in the system, but doing this means we're more likely to get a
usable device in the multi-device case.
cuda would support decoding on one device and displaying on another
but the peer memory handling is not transparent and I have no way
to test it so I can't really write it.