nvcodec: low resolutions transcode faster with host memory, high resolutions faster with GL memory
I noticed that when transcoding with nvh264dec and nvh264enc, lower resolutions perform better with system (host) memory than with GL memory, while higher resolutions perform better with GL memory than with system (host) memory.
Profiling the pipelines with nvprof shows memory copy behavior in line with what you'd expect: device-to-host copies are slower than device-to-device copies. Since the copies themselves perform as expected, what could be causing the slower GL memory performance, and why does it only affect lower resolutions?
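For anyone trying to reproduce the comparison, here is a minimal sketch of how the two variants could be built and timed. It is not the exact pipeline used for the measurements above: the input file name, the `qtdemux` demuxer (assuming an MP4 container with H.264 video), the `fakesink` at the end, and the helper names `build_transcode_argv` and `time_pipeline` are all assumptions made for illustration. The only difference between the two variants is the caps filter placed between nvh264dec and nvh264enc.

```python
import subprocess
import time

# Caps filters that select the memory type negotiated between decoder and encoder.
CAPS = {
    "system": "video/x-raw",
    "gl": "video/x-raw(memory:GLMemory)",
}


def build_transcode_argv(path, memory):
    """Return a gst-launch-1.0 argv for an nvh264dec -> nvh264enc transcode.

    `path` is a hypothetical input file; `memory` is "system" or "gl".
    """
    if memory not in CAPS:
        raise ValueError(f"unknown memory type: {memory!r}")
    return [
        "gst-launch-1.0", "-q",
        "filesrc", f"location={path}", "!",
        "qtdemux", "!",            # assumption: MP4 input
        "h264parse", "!",
        "nvh264dec", "!",
        CAPS[memory], "!",         # forces host or GL memory on this link
        "nvh264enc", "!",
        "fakesink", "sync=false",  # discard output, run as fast as possible
    ]


def time_pipeline(argv):
    """Run the pipeline to completion and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "input.mp4" is a placeholder; substitute clips at different resolutions.
    for memory in ("system", "gl"):
        argv = build_transcode_argv("input.mp4", memory)
        print(memory, f"{time_pipeline(argv):.2f}s")
```

Running this once per resolution gives the host versus GL timings side by side. Prefixing the argv with `nvprof` should give the same kind of memory copy profile as described above.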