How do I construct a 3D model of a room from 2 stereo cameras? What is the determining factor for an accurate reconstruction?

yasumi · Jun 18, 2010 · Viewed 11.9k times

Currently, I have extracted depth points to construct a 3D model from 2 stereo cameras. The methods I have used are the OpenCV graph-cut method and software from http://sourceforge.net/projects/reconststereo/. However, the generated 3D models are not very accurate, which leads me to ask: 1) What is the problem with pixel-based methods? 2) Should I switch from a pixel-based method to a feature-based or object-recognition-based one? Is there a best method? 3) Are there any other ways to do such a reconstruction?

Additionally, the depth is extracted from only 2 images. What if I rotate the camera 360 degrees to capture a video? I look forward to suggestions on how to combine this depth information.

Thank you very much :)

Answer

Roman Shapovalov · Jun 22, 2010

The key factor that determines the accuracy of stereo reconstruction is disparity estimation. This area has been investigated extensively, and state-of-the-art results are collected on this page: http://vision.middlebury.edu/stereo/eval/ I recommend picking one of the top methods. You will probably need to implement it yourself (references to the papers are at the bottom of the page) or look for an implementation on the authors' homepages. Also look at http://vision.middlebury.edu/MRF/code/ .
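To see why disparity estimation matters so much, here is a minimal sketch assuming a rectified stereo pair, where depth follows the standard relation Z = f * B / d (focal length f in pixels, baseline B in meters, disparity d in pixels). The function names and the example numbers are illustrative assumptions, not taken from any of the tools mentioned in the question.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth error caused by a disparity error:
    |dZ| ~= Z^2 / (f * B) * |dd|."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig: f = 700 px, B = 0.10 m.
z = depth_from_disparity(35, 700, 0.10)   # a point with 35 px disparity
err_near = depth_error(2.0, 700, 0.10, 1.0)  # 1 px disparity error at 2 m
err_far = depth_error(4.0, 700, 0.10, 1.0)   # the same error at 4 m
```

Because the error grows with the square of the depth, a one-pixel disparity mistake on a far wall costs roughly four times as much as the same mistake at half the distance.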

You should also try to figure out the reason for the low accuracy. It may be that the algorithm fails to capture the structure of the scene, or simply that the output resolution is too low. In the latter case you need sub-pixel accuracy. A number of methods address this problem. On the Middlebury evaluation page, use the Error Threshold combo-box to rank the algorithms according to the desired precision.
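One common way to get sub-pixel disparities is to fit a parabola through the matching cost at the best integer disparity and its two neighbours. The sketch below illustrates that idea only; it is not the method of any particular algorithm on the evaluation page, and `refine_subpixel` and the `costs` list are hypothetical names.

```python
def refine_subpixel(costs, d):
    """Refine integer disparity d using a parabola through
    costs[d - 1], costs[d], costs[d + 1].

    costs: matching costs for one pixel, indexed by integer disparity.
    d: index of the minimum cost.
    """
    # No neighbour on one side: keep the integer estimate.
    if d <= 0 or d >= len(costs) - 1:
        return float(d)
    c_prev, c_min, c_next = costs[d - 1], costs[d], costs[d + 1]
    denom = c_prev - 2.0 * c_min + c_next
    # A flat or non-convex cost curve has no well-defined minimum.
    if denom <= 0:
        return float(d)
    return d + 0.5 * (c_prev - c_next) / denom


# Costs sampled from (x - 3.3)^2 at x = 0..6; the integer minimum is at 3.
costs = [(x - 3.3) ** 2 for x in range(7)]
best = min(range(len(costs)), key=costs.__getitem__)
refined = refine_subpixel(costs, best)
```

For a cost curve that is exactly quadratic, as in this toy input, the fit recovers the true minimum; real cost curves are only approximately quadratic, so the gain is smaller in practice.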

Multiple cameras could help as well. The keywords to search for are "multi-view stereo".
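As for combining depth from a rotating camera, the simplest approach is to back-project each depth map and transform the points into one common frame. The sketch below is not multi-view stereo itself. It assumes a pinhole camera, a known yaw angle for each view, and a pure rotation about the vertical axis through the camera centre with no translation; all function names and intrinsics are hypothetical.

```python
import math


def backproject(u, v, depth, focal_px, cx, cy):
    """Pixel (u, v) with depth Z -> 3D point in the camera frame
    (x right, y down, z forward)."""
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)


def camera_to_world(point, yaw_rad):
    """Rotate a camera-frame point about the vertical (y) axis by the view's yaw."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)


def merge_views(views, focal_px, cx, cy):
    """views: list of (yaw_rad, [(u, v, depth), ...]); returns one point cloud."""
    cloud = []
    for yaw_rad, pixels in views:
        for u, v, depth in pixels:
            p = backproject(u, v, depth, focal_px, cx, cy)
            cloud.append(camera_to_world(p, yaw_rad))
    return cloud


# Two views of the image centre at depth 2 m, taken 90 degrees apart.
cloud = merge_views(
    [(0.0, [(320, 240, 2.0)]), (math.pi / 2, [(320, 240, 2.0)])],
    focal_px=700, cx=320, cy=240,
)
```

In a real capture the poses are not known exactly and have to be estimated, and overlapping depth maps usually disagree, so a fusion or filtering step is needed on top of this.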