Saving the OpenGL context as a video output

activatedgeek · Sep 28, 2013 · Viewed 11.3k times

I am currently trying to save an animation rendered in OpenGL to a video file. I have tried OpenCV's VideoWriter, without success. I have been able to take a snapshot and save it as a BMP using the SDL library. If I save every snapshot and then generate the video with ffmpeg, I end up collecting about 4 GB of images, which is not practical. How can I write video frames directly during rendering? Here is the code I use to take a snapshot when I need one:

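#include <cstring>  // std::memcpy; the SDL and OpenGL headers are assumed to be included already

void snapshot() {
    // 24-bit RGB surface whose channel masks match the GL_RGB byte order on a little-endian host.
    SDL_Surface* snap = SDL_CreateRGBSurface(SDL_SWSURFACE, WIDTH, HEIGHT, 24, 0x000000FF, 0x0000FF00, 0x00FF0000, 0);
    char* pixels = new char[3 * WIDTH * HEIGHT];

    // Pack rows tightly; the default alignment of 4 pads each row when 3 * WIDTH is not a multiple of 4.
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glReadPixels(0, 0, WIDTH, HEIGHT, GL_RGB, GL_UNSIGNED_BYTE, pixels);

    // OpenGL's origin is the bottom-left corner, so copy the rows in reverse order.
    for (int i = 0; i < HEIGHT; i++)
        std::memcpy(((char*) snap->pixels) + snap->pitch * i, pixels + 3 * WIDTH * (HEIGHT - i - 1), WIDTH * 3);

    delete[] pixels;
    SDL_SaveBMP(snap, "snapshot.bmp");
    SDL_FreeSurface(snap);
}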

I need the video output. I have discovered that ffmpeg can be used to create videos from C++ code but have not been able to figure out the process. Please help!

EDIT: I have tried using OpenCV's CvVideoWriter class, but the program crashes ("segmentation fault") the moment it is declared. Compilation shows no errors, of course. Any suggestions?

SOLUTION FOR PYTHON USERS (requires Python 2.7, python-imaging, python-opengl, python-opencv, and the codecs for the format you want to write to; I am on Ubuntu 14.04 64-bit):

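import os
import cv2
from PIL import Image
from OpenGL.GL import glReadPixels, GL_RGBA, GL_UNSIGNED_BYTE

def snap():
    # W, H, videoPath and videoWriter are globals set up elsewhere.
    screenshot = glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE)
    snapshot = Image.frombuffer("RGBA", (W, H), screenshot, "raw", "RGBA", 0, 0)
    # Round-trip through a temporary JPEG so that OpenCV can load the frame.
    temp_path = os.path.dirname(videoPath) + "/temp.jpg"
    snapshot.save(temp_path)
    load = cv2.cv.LoadImage(temp_path)
    cv2.cv.WriteFrame(videoWriter, load)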

Here W and H are the window dimensions (width, height). I use PIL to convert the raw pixels returned by glReadPixels into a JPEG image, then load that JPEG as an OpenCV image and write it to the video writer. I had some issues passing the PIL image directly to the video writer (which would avoid the disk I/O of the temporary file), but I am not working on that right now. Image is a PIL module and cv2 is the python-opencv module.

Answer

Andon M. Coleman · Sep 28, 2013

It sounds as though you are using the ffmpeg command-line utility. Rather than using the command line to encode video from a collection of still images, you should use libavcodec and libavformat. These are the libraries upon which ffmpeg is actually built, and they allow you to encode video and store it in a standard stream/interchange format (e.g. RIFF/AVI) without using a separate program.

You probably will not find a lot of tutorials on implementing this, because traditionally people have wanted to use ffmpeg to go the other way; that is, to decode various video formats for display in OpenGL. I think this is going to change very soon: with the introduction of gameplay video encoding on the PS4 and Xbox One consoles, demand for this functionality will skyrocket.

The general process is this, however:

  1. Pick a container format and codec.
    • Often one will decide the other (e.g. MPEG-2 + MPEG Program Stream).
  2. Start filling a buffer with your still frames.
  3. Periodically encode your buffer of still frames and write to your output (packet writing in MPEG terms).
    • You will do this either when the buffer becomes full or every n ms; which you prefer depends on whether you want to stream your video live.
  4. When your program terminates, flush the buffer and close your stream.
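
Steps 2 to 4 can be sketched in plain C++ roughly as follows. This is only an illustration of the buffering logic, not an encoder: FrameBatcher, Frame and EncodeBatchFn are hypothetical names invented for this sketch, and the callback stands in for whatever libavcodec/libavformat code actually encodes and writes the packets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// One raw frame, e.g. the tightly packed RGB bytes returned by glReadPixels.
using Frame = std::vector<std::uint8_t>;

// Hypothetical sink: receives a batch of frames to encode and write out.
// A real program would pass these to its encoder here.
using EncodeBatchFn = std::function<void(const std::vector<Frame>&)>;

class FrameBatcher {
public:
    FrameBatcher(std::size_t capacity, EncodeBatchFn encode)
        : capacity_(capacity), encode_(std::move(encode)) {
        assert(capacity_ > 0);
        pending_.reserve(capacity_);
    }

    // Step 2: queue a frame. Step 3: encode as soon as the buffer is full.
    void push(Frame frame) {
        pending_.push_back(std::move(frame));
        if (pending_.size() >= capacity_) flush();
    }

    // Step 4: call once at shutdown so that a partial batch is not lost.
    // Does nothing when no frames are pending.
    void flush() {
        if (pending_.empty()) return;
        encode_(pending_);
        pending_.clear();
    }

private:
    std::size_t capacity_;
    EncodeBatchFn encode_;
    std::vector<Frame> pending_;
};
```

The render loop would call push() once per captured frame and flush() once before closing the stream. Flushing every n ms instead of on a full buffer only changes the condition inside push().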

One nice thing about this is you do not actually need to write to a file. Since you are periodically encoding packets of data from your buffer of still frames, you can stream your encoded video over a network if you want - this is why codec and container (interchange) format are separate.

Another nice thing is that you do not have to synchronize the CPU and GPU: you can set up a pixel buffer object and have OpenGL copy data into CPU memory a couple of frames behind the GPU. This makes real-time encoding of video much less demanding, because you only have to encode and flush the video to disk or over the network periodically, as long as your video latency requirements are not too strict. This works very well in real-time rendering, since you have a large enough pool of data to keep a CPU thread busy encoding at all times.
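
The "couple of frames behind" part comes down to index bookkeeping over a small ring of pixel buffer objects. The sketch below models only that bookkeeping, so it runs without an OpenGL context; PboRing and its member names are hypothetical, and the OpenGL calls that would use these indices are described after the code rather than executed.

```cpp
#include <cassert>
#include <cstddef>

// Index bookkeeping for a ring of `count` pixel buffer objects (PBOs).
// Each frame, the renderer starts an asynchronous read into one PBO and
// maps a different PBO whose transfer was started count - 1 frames earlier.
class PboRing {
public:
    explicit PboRing(std::size_t count) : count_(count) { assert(count_ >= 2); }

    // Index of the PBO that receives this frame's glReadPixels.
    std::size_t write_index() const { return frame_ % count_; }

    // Index of the PBO to map this frame; it holds the oldest in-flight frame.
    std::size_t read_index() const { return (frame_ + 1) % count_; }

    // False during warm-up, while the PBO at read_index() has never been written.
    bool read_ready() const { return frame_ + 1 >= count_; }

    // Number of the frame stored at read_index(); meaningful only if read_ready().
    std::size_t read_frame() const { return frame_ + 1 - count_; }

    // Call once at the end of every rendered frame.
    void advance() { ++frame_; }

private:
    std::size_t count_;
    std::size_t frame_ = 0;
};
```

In a real renderer, each frame would bind the PBO at write_index() to GL_PIXEL_PACK_BUFFER and call glReadPixels with a null pointer, so that the call returns without waiting for the copy. If read_ready() is true, it would then bind the PBO at read_index(), map it with glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY), hand the pixels to the encoder, and call glUnmapBuffer. A larger ring gives the GPU more time to finish each transfer, at the cost of more latency and memory.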

Encoding frames can even be done in real time on the GPU, provided there is enough storage for a large buffer of frames (since ultimately the encoded data has to be copied from GPU to CPU, and you want to do this as infrequently as possible). Obviously this is not done using ffmpeg; there are specialized libraries using CUDA / OpenCL / compute shaders for this purpose. I have never used them, but they do exist.

For portability's sake, you should stick with libavcodec and Pixel Buffer Objects for asynchronous GPU->CPU copy. CPUs these days have enough cores that you can probably get away without GPU-assisted encoding if you buffer enough frames and encode in multiple simultaneous threads (this adds synchronization overhead and increases latency when outputting encoded video), or if you simply drop frames or lower the resolution (the poor man's solution).

There are a lot of concepts covered here that go well beyond the scope of SDL, but you did ask how to do this with better performance than your current solution. In short, use OpenGL Pixel Buffer Objects to transfer data, and libavcodec for encoding. An example application that encodes video can be found on the ffmpeg libavcodec examples page.