I have successfully implemented a video player using FFmpeg. I am now trying to use hardware decoding, but I'm facing a couple of issues. I used this post as a starting point: https://ffmpeg.org/pipermail/libav-user/2014-August/007323.html
I have updated the code that sets up the decoder. The updated code is available here: https://drive.google.com/file/d/0B5ufHdoDzA4ieVk5UVpxcDNzRHc/view?usp=sharing
And this is how I'm using it to initialize the decoder:
// Prepare the decoding context
AVCodec *codec = nullptr;
_codecContext = _avFormatContext->streams[_streamIndex]->codec;
if ((codec = avcodec_find_decoder(_codecContext->codec_id)) == 0)
{
    std::cout << "Unsupported video codec!" << std::endl;
    return false;
}
_codecContext->thread_count = 1; // Multithreading is apparently not compatible with hardware decoding
InputStream *ist = new InputStream();
ist->hwaccel_id = HWACCEL_AUTO;
ist->hwaccel_device = "dxva2";
ist->dec = codec;
ist->dec_ctx = _codecContext;
_codecContext->coded_width = _width;
_codecContext->coded_height = _height;
_codecContext->opaque = ist;
dxva2_init(_codecContext);
_codecContext->get_buffer2 = ist->hwaccel_get_buffer;
_codecContext->get_format = GetHwFormat;
_codecContext->thread_safe_callbacks = 1;
if (avcodec_open2(_codecContext, codec, nullptr) < 0)
{
    std::cout << "Video codec open error" << std::endl;
    return false;
}
And here is the definition of GetHwFormat referenced above:
AVPixelFormat GetHwFormat(AVCodecContext *s, const AVPixelFormat *pix_fmts)
{
    InputStream* ist = (InputStream*)s->opaque;
    ist->active_hwaccel_id = HWACCEL_DXVA2;
    ist->hwaccel_pix_fmt = AV_PIX_FMT_DXVA2_VLD;
    return ist->hwaccel_pix_fmt;
}
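As a side note, the callback above returns AV_PIX_FMT_DXVA2_VLD without looking at pix_fmts. That list is terminated by AV_PIX_FMT_NONE, so a callback could first check whether the hardware format is actually offered. The following is only a minimal sketch of that scan: PixFmt and formatOffered are hypothetical stand-ins so the snippet compiles without the FFmpeg headers, and it is not what the code linked above does.

```cpp
#include <cassert>

// Hypothetical stand-in for a few AVPixelFormat values (the real enum is much larger).
enum PixFmt { PIX_FMT_NONE = -1, PIX_FMT_YUV420P = 0, PIX_FMT_DXVA2_VLD = 1 };

// Returns true if `wanted` appears in `offered`, a list terminated by PIX_FMT_NONE.
bool formatOffered(const PixFmt *offered, PixFmt wanted)
{
    assert(offered != nullptr);
    for (const PixFmt *p = offered; *p != PIX_FMT_NONE; ++p)
    {
        if (*p == wanted)
            return true;
    }
    return false;
}
```

In a real get_format callback, a false result would be the point at which to fall back to a software pixel format instead of forcing the hardware one.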
When I open an H.264-encoded MP4 video at HD resolution or lower, everything seems to work fine. However, as soon as I try higher-resolution videos such as 3840x2160, I repeatedly get the following errors:
Failed to execute: 0x80070057
Hardware accelerator failed to decode picture
I also start getting the following errors after a few seconds:
co located POCs unavailable
The video is also not displayed properly: there are artifacts all over the picture and playback lags. I looked up the first error in the FFmpeg source code. It seems that IDirectXVideoDecoder_Execute fails because of an invalid parameter (0x80070057 is E_INVALIDARG). Since this happens within FFmpeg, I must be missing something, but I can't figure out what. The only relevant post I found attributed this error to multithreading, but I already set thread_count to 1 before opening the codec.
This issue is happening on my main computer which has the following specs:
The same issue is not happening on my second computer which has the following specs:
If I use DXVAChecker on my main computer, it says that my graphics card supports DXVA2 for H264_VLD_*, and I can see that the calls to the Microsoft API are being made (DXVA2_DecodeDeviceCreated, DXVA2_DecodeDeviceBeginFrame, DXVA2_DecodeDeviceGetBuffer, DXVA2_DecodeDeviceExecute, DXVA2_DecodeDeviceEndFrame) while my video is playing.
I also don't see any increase in GPU usage (on either computer) between the version with hardware decoding and the version without. I do see a decrease in CPU usage, although it is smaller than I expected. This is also very strange.
Note that I tried both the Windows release available on the FFmpeg website, and a version that I compiled with --enable-dxva2. I have searched a lot already but I was unable to find what I'm doing wrong.
Hopefully, someone can help me, or maybe point me to a better example?
I finally found out what my issue was. After calling avcodec_decode_video2, I was not advancing the packet's data pointer and reducing its size by the number of bytes the decoder consumed. The fix looks like this:
int r = avcodec_decode_video2(_codecContext, frame, &frameDecoded, &pkt);
pkt.size -= r;
pkt.data += r;
Now, the video is properly decoded and I have no artifacts anymore.
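For context, avcodec_decode_video2 returns the number of bytes it consumed (or a negative value on error), so a packet may need more than one call before it is fully drained. Here is a minimal sketch of that loop. Packet and fakeDecode are hypothetical stand-ins for AVPacket and avcodec_decode_video2 so that the snippet compiles without FFmpeg; only the loop structure is the point.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVPacket: only the two fields the loop touches.
struct Packet
{
    const std::uint8_t *data;
    int size;
};

// Hypothetical stand-in for avcodec_decode_video2(): consumes at most `chunk`
// bytes per call, sets *gotFrame when it produced a frame, and returns the
// number of bytes consumed.
int fakeDecode(const Packet &pkt, int chunk, int *gotFrame)
{
    int used = pkt.size < chunk ? pkt.size : chunk;
    *gotFrame = used > 0 ? 1 : 0;
    return used;
}

// Keeps calling the decoder until the whole packet has been consumed.
// Returns the number of frames produced, or -1 on a decoder error.
int decodeWholePacket(Packet pkt, int chunk)
{
    int frames = 0;
    while (pkt.size > 0)
    {
        int gotFrame = 0;
        int r = fakeDecode(pkt, chunk, &gotFrame);
        if (r < 0)
            return -1;
        if (r == 0 && !gotFrame)
            break; // no progress: stop instead of looping forever
        assert(r <= pkt.size);
        pkt.size -= r; // the two lines that were missing in my code
        pkt.data += r;
        if (gotFrame)
            ++frames;
    }
    return frames;
}
```

With a 10-byte packet and a decoder that takes 4 bytes per call, the loop runs three times (4, 4, then 2 bytes). Without the size and data updates, every call would see the same bytes again.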
As for the lag, I believe it was a separate issue, unrelated to the error messages, caused by the time it takes to copy the decoded image back to CPU memory. If you need to do this, instead of using av_image_copy_plane as in the code I posted with my question, you may want to look at what VLC does, or at this article: https://software.intel.com/en-us/articles/copying-accelerated-video-decode-frame-buffers. In a quick test on my machine, it reduced the copy time by a factor of 7 or 8.
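To make the copy step concrete, here is a rough sketch of what a plain plane copy has to do when the source and destination rows have different pitches (bytes per row, including padding). copyPlane is a hypothetical helper written for illustration. It is neither av_image_copy_plane itself nor the faster technique from the linked article; it only shows the baseline work that those approaches speed up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `rows` rows of `rowBytes` visible bytes each from src to dst.
// Each buffer may pad its rows, so the pitches can differ.
void copyPlane(std::uint8_t *dst, std::size_t dstPitch,
               const std::uint8_t *src, std::size_t srcPitch,
               std::size_t rowBytes, std::size_t rows)
{
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);
    if (dstPitch == rowBytes && srcPitch == rowBytes)
    {
        // No padding on either side: the plane is one contiguous block.
        std::memcpy(dst, src, rowBytes * rows);
        return;
    }
    for (std::size_t y = 0; y < rows; ++y)
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
}
```

For a decoded frame, this would be called once per plane with the pitch reported for the locked surface as srcPitch.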