Buffering while converting stream to frames with ffmpeg

Pavel K. · Feb 11, 2016

I am trying to convert a UDP stream into frames using ffmpeg. I run the following command:

ffmpeg -loglevel debug -strict 2 -re -i "udp://192.168.15.50:3200?fifo_size=1000000&overrun_nonfatal=1" -r 8 -vf scale=432:243 -f image2pipe -vcodec ppm pipe:1

This happens with different stream types, both mpeg2video and h264. CPU load on the core processing this specific stream is under 30%; it is a low-quality SD stream with a resolution of 640x576.

It works well most of the time, but once in a while latency occurs and frames arrive late. I want exactly 8 fps, but sometimes I get fewer frames and sometimes more.

Why does this latency occur, and how can I reduce it?

Update: I tried changing the command to:

ffmpeg -loglevel debug -i "udp://192.168.15.50:3200?fifo_size=1000000&overrun_nonfatal=1" -r 8 -preset ultrafast -fflags nobuffer -vf scale=432:243 -f image2pipe -vcodec ppm pipe:1

But I still get the issue. For example, in the ffmpeg log I get:

[2016/02/11 13:32:30] frame= 7477 fps=8.0 q=-0.0 size= 2299638kB time=00:15:34.62 bitrate=20156.4kbits/s dup=7 drop=15867 ^M*** dropping frame 7477 from stream 0 at ts 7475
[2016/02/11 13:32:30] ***dropping frame 7477 from stream 0 at ts 7476
[2016/02/11 13:32:30] ***dropping frame 7478 from stream 0 at ts 7476
[2016/02/11 13:32:32] Last message repeated 1 times
[2016/02/11 13:32:32] frame= 7479 fps=8.0 q=-0.0 size= 2300253kB time=00:15:34.87 bitrate=20156.4kbits/s dup=7 drop=15871 ^M*** dropping frame 7479 from stream 0 at ts 7477

As you can see, during second 31 no frames are output, and the time between two frames reported by ffmpeg is 0.25 s.

Answer

Pavel K. · Feb 19, 2016

The ffmpeg command posted in the question is normally piped into another binary. That binary saves the frames provided by ffmpeg and does some processing on them.
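That binary is not shown here, but for context, a minimal consumer could look roughly like the Python sketch below, which parses ffmpeg's binary PPM (P6) output from stdin. This is hypothetical illustration code, not our actual binary, and it assumes it is run as ffmpeg ... -vcodec ppm pipe:1 | python3 reader.py.

import sys

def read_ppm_frame(stream):
    # One binary PPM (P6) frame: "P6\n<w> <h>\n<maxval>\n" followed by w*h*3 bytes.
    magic = stream.readline().strip()
    if not magic:
        return None                            # EOF: ffmpeg closed the pipe
    if magic != b"P6":
        raise ValueError("unexpected PPM magic: %r" % magic)
    width, height = map(int, stream.readline().split())
    maxval = int(stream.readline())            # 255 for ffmpeg's PPM encoder
    data = stream.read(width * height * 3)
    if len(data) < width * height * 3:
        return None                            # truncated frame at EOF
    return width, height, data

count = 0
while True:
    frame = read_ppm_frame(sys.stdin.buffer)
    if frame is None:
        break
    count += 1
    # ... save the frame to disk and do further processing here ...
print("read %d frames" % count, file=sys.stderr)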

In the beginning I didn't use the "fifo_size=1000000&overrun_nonfatal=1" options, and I was getting the following error from ffmpeg:

[udp @ 0x4ceb8a0] Circular buffer overrun. To avoid, increase fifo_size URL option. To survive in such case, use overrun_nonfatal option
udp://192.168.15.50:3200: Input/output error

and then ffmpeg would crash. To avoid it I added "fifo_size=1000000&overrun_nonfatal=1", as ffmpeg suggests.

However, after adding those parameters I would get the timeshift described in the question, sometimes accompanied by artifacts in the frames.

As mentioned, there were no issues with CPU, so we initially suspected the UDP stream, specifically the UDP buffer size:

https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html

So we changed the UDP buffer size with:

sysctl -w net.core.rmem_max=26214400

and changed the ffmpeg command to use "udp://231.20.20.8:2005?buffer_size=26214400".
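As a side note, one way to verify how much receive buffer the kernel actually grants is a quick Python check like the one below (illustration only, not part of our setup):

import socket

REQUESTED = 26214400  # same value as buffer_size / rmem_max above

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED)

# Linux caps the request at net.core.rmem_max and doubles it internally for
# bookkeeping, so getsockopt() reports roughly twice what was actually granted.
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("requested %d bytes, kernel reports %d" % (REQUESTED, granted))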

However, this didn't fix the issue: ffmpeg would still hit the "Circular buffer overrun" and crash. And I couldn't reproduce the overrun on demand; it just happened randomly.

My next thought was the pipe buffer size, since I found the following:

http://blog.dataart.com/linux-pipes-tips-tricks/

The size of the buffer since kernel version 2.6.11 is 65536 bytes (64K) and is equal to the page memory in older kernels. When attempting to read from an empty buffer, the read process is blocked until data appears.
Similarly, if you attempt to write to a full buffer, the recording process will be blocked until the necessary amount of space is available.

http://ffmpeg.gusari.org/viewtopic.php?f=12&t=624 [link now dead]

Poster1: What causes these circular buffer overruns? My assumption is that ffmpeg is reading the input stream into the aforementioned circular buffer, and the code that generates the output stream also reads from that same buffer. The overrun would happen when the code that generates the output doesn't keep up with the rate at which it's being written to the buffer, right?
Poster2: Looking at the source code it appears that the buffer gets overflowed either by too fast input or too slow output (slow cpu?). Your assumption is correct.
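For reference, Linux lets you inspect a pipe's capacity, and enlarge it up to /proc/sys/fs/pipe-max-size (1 MiB by default), via fcntl. A small Python sketch of that (we did not end up needing it, posting only for completeness):

import fcntl
import os

# Fall back to the raw Linux constants on Pythons older than 3.10,
# where fcntl does not export these names.
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)

r, w = os.pipe()
print("default pipe capacity:", fcntl.fcntl(r, F_GETPIPE_SZ))   # typically 65536

fcntl.fcntl(r, F_SETPIPE_SZ, 1048576)                           # ask for 1 MiB
print("enlarged pipe capacity:", fcntl.fcntl(r, F_GETPIPE_SZ))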

So the theory was that our binary doesn't read the pipe fast enough. As a result the pipe blocks, ffmpeg cannot write to it, and THAT results in the UDP FIFO buffer overrun (ffmpeg keeps reading UDP into the FIFO, but cannot write from it into our pipe).

I managed to prove this theory by running (in separate terminals):

mkfifo mypipe
ffmpeg -loglevel debug -i "udp://192.168.15.50:3200?fifo_size=1000000&overrun_nonfatal=1" -r 8 -preset ultrafast -fflags nobuffer -vf scale=432:243 -f image2pipe -vcodec ppm pipe:1 > mypipe
cat < mypipe > /dev/null # run this for ~10 seconds to let ffmpeg start, then pause it with CTRL-Z and watch ffmpeg crash because it can no longer drain the UDP stream

The next step was investigating why our binary, at some point, stops reading the pipe. There seemed to be no reason, because normally it reads everything into memory as soon as something arrives on the pipe.

However, it was also saving frames to the hard drive, and at SOME POINT (sometimes after 12 minutes, sometimes after 15 hours) disk operations would slow down under the read/write load (the drive was bcache, an SSD/HDD hybrid that uses the SSD as a cache). I caught this by chance while removing a few million files from this drive in parallel for debugging.

So, writing files to a busy hard drive would temporarily block our binary from reading the input pipe.

The root cause of the UDP circular buffer overrun and the eventual timeshift was the HDD, and the theoretical solution is an SSD.
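In hindsight, another way to shield the pipe reader from slow disk writes is to decouple them with a bounded in-memory queue, so the reading loop never blocks on the disk. A rough Python sketch of that idea (hypothetical, not our actual binary):

import queue
import threading

# Bounded queue between the pipe reader and the disk writer: if the disk
# stalls, frames pile up (or get dropped) here instead of back-pressuring
# the pipe and, through it, ffmpeg's UDP FIFO.
frame_queue = queue.Queue(maxsize=64)

def disk_writer():
    n = 0
    while True:
        frame = frame_queue.get()
        if frame is None:                      # sentinel: shut down
            break
        width, height, data = frame
        with open("frame_%08d.ppm" % n, "wb") as f:
            f.write(b"P6\n%d %d\n255\n" % (width, height))
            f.write(data)
        n += 1

threading.Thread(target=disk_writer, daemon=True).start()

def handle_frame(frame):
    try:
        frame_queue.put_nowait(frame)          # never block the pipe-reading loop
    except queue.Full:
        pass                                   # dropping one frame beats stalling ffmpeg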

This investigation took about 3 weeks, so I'm posting all of it in the hope that it will, at least in part, help someone in the future.

Update:

Later on I detected another bottleneck causing the same issue (replacing the HDD was not enough): a TCP socket buffer overflow caused by Postgres insertions on the backend.

The entire pipeline looks like this:

udp_videostream -> ffmpeg -> linux_pipe -> our_client_side_binary -> tcp -> our_server_side_binary -> postgres

Postgres queries were sometimes slow, which caused our server to read the TCP socket more slowly than our_client_side_binary was pushing into it. As a result the TCP socket would fill up and block (it was capped at 4 MB); that in turn made the client block on its input pipe, and that in turn made ffmpeg crash with the circular buffer overrun error.
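A possible mitigation on the server side is, again, to decouple the socket-reading loop from the database writes and batch the inserts, so the TCP receive buffer keeps draining even while Postgres is slow. A hypothetical Python sketch (the table name and DSN are made up):

import queue
import threading

import psycopg2
from psycopg2.extras import execute_values

# Bounded buffer between the socket-reading loop and the DB writer.
row_queue = queue.Queue(maxsize=10000)

def db_writer():
    conn = psycopg2.connect("dbname=frames")   # hypothetical DSN and schema
    while True:
        rows = [row_queue.get()]               # wait for at least one row
        while len(rows) < 1000:                # then drain whatever else is queued
            try:
                rows.append(row_queue.get_nowait())
            except queue.Empty:
                break
        # One batched INSERT amortises a slow Postgres over many rows.
        with conn, conn.cursor() as cur:
            execute_values(cur, "INSERT INTO frames (ts, path) VALUES %s", rows)

threading.Thread(target=db_writer, daemon=True).start()

# The socket-reading loop then only does row_queue.put((ts, path)) per frame,
# so the receive buffer is drained promptly regardless of query latency.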