Real-time video (image) stitching

SilentButDeadly JC · May 2, 2012 · Viewed 17k times

I'm thinking of stitching images from 2 or more (currently maybe 3 or 4) cameras in real time using OpenCV 2.3.1 on Visual Studio 2008.

However, I'm curious about how it is done.

Recently I've studied some feature-based image stitching techniques.

Most of them require at least the following steps:

1. Feature detection
2. Feature matching
3. Finding the homography
4. Transformation of target images to reference images
...etc.
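To make step 4 of the list above concrete, here is a minimal sketch in plain Python (no OpenCV), assuming step 3 has already produced a 3x3 homography. The function names `apply_homography` and `warp_nearest` are hypothetical helpers written for this illustration; in OpenCV the corresponding roles are played by `cv::findHomography` and `cv::warpPerspective`.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as a list of 3 rows."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to get back to pixel coordinates.
    return xh / w, yh / w


def warp_nearest(src, H_inv, out_w, out_h, fill=0):
    """Inverse warp: for each output pixel, fetch the source pixel at H_inv * (x, y).

    src is a list of rows (grayscale values); H_inv maps output coordinates
    back into the source image. Nearest-neighbour sampling, no blending.
    """
    src_h, src_w = len(src), len(src[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            sx, sy = apply_homography(H_inv, x, y)
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < src_w and 0 <= iy < src_h:
                out[y][x] = src[iy][ix]
    return out


# Toy example: the target image is shifted 2 pixels to the right in the
# reference frame, so the inverse mapping shifts 2 pixels to the left.
target = [[1, 2],
          [3, 4]]
H_inv = [[1, 0, -2],
         [0, 1, 0],
         [0, 0, 1]]
print(warp_nearest(target, H_inv, 4, 2))  # [[0, 0, 1, 2], [0, 0, 3, 4]]
```

The warp itself is just a per-pixel lookup, so once a homography is known it can be applied to every new frame without repeating steps 1 to 3. Whether a single fixed homography is accurate enough depends on the scene.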

Most of the techniques I've read about process a set of images only "ONCE", whereas I want to process a continuous series of frames captured from several cameras, and I want it to be "REAL-TIME".

That may still sound confusing, so here is the setup in more detail:

Place 3 cameras at different angles and positions so that each one has an overlapping field of view with its neighbor, then stitch their feeds together into a single REAL-TIME video.

What I would like to do is similar to the video at the following link, where ASIFT is used.

http://www.youtube.com/watch?v=a5OK6bwke3I

I tried to contact the owner of that video but got no reply :(.

Can I use image-stitching methods for video stitching? A video is just a series of images, so I wonder if this is possible. However, detecting feature points seems to be very time-consuming whichever feature detector (SURF, SIFT, ASIFT, etc.) you use. This makes me doubt that real-time video stitching is feasible.

Answer

wcochran · May 16, 2012

I have worked on a real-time video stitching system and it is a difficult problem. I can't disclose the full solution we used due to an NDA, but I implemented something similar to the one described in this paper. The biggest problem is coping with objects at different depths (simple homographies are not sufficient); depth disparities must be determined and the video frames appropriately warped so that common features are aligned. This is essentially a stereo vision problem. The images must first be rectified so that common features appear on the same scan line.
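To illustrate why rectification helps, here is a minimal sketch in plain Python. It is not the method from the system described above, only a textbook sum-of-absolute-differences (SAD) search along one scan line. It assumes the two rows come from already rectified images, so a feature at column x in the left row appears at column x - d in the right row. The names `scanline_disparity` and `depth_from_disparity`, and the window and search-range parameters, are hypothetical choices for this example.

```python
def scanline_disparity(left_row, right_row, x, half_win=1, max_disp=8):
    """Return the disparity d that minimises the SAD cost between the window
    around left_row[x] and the window around right_row[x - d]."""
    best_d, best_cost = 0, float("inf")
    lo, hi = x - half_win, x + half_win
    for d in range(max_disp + 1):
        # Skip candidates whose window would fall outside either row.
        if lo - d < 0 or hi >= len(left_row):
            continue
        cost = sum(abs(left_row[i] - right_row[i - d]) for i in range(lo, hi + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(focal_px, baseline, disparity):
    """Standard rectified-stereo relation Z = f * B / d (d must be non-zero)."""
    return focal_px * baseline / disparity


# Toy rows: the same bright feature sits at column 5 on the left
# and at column 2 on the right, i.e. a disparity of 3 pixels.
left = [0, 0, 0, 0, 10, 50, 10, 0, 0, 0]
right = [0, 10, 50, 10, 0, 0, 0, 0, 0, 0]
print(scanline_disparity(left, right, 5, max_disp=4))  # 3
```

Because depth is inversely proportional to disparity, near and far objects shift by different amounts between cameras. That is why one homography cannot align everything in the overlap region at once.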