Computing x,y coordinate (3D) from image point

Banana · Sep 6, 2012 · Viewed 27.6k times

I have a task to locate an object in a 3D coordinate system. Since I need almost exact X and Y coordinates, I decided to track one color marker with a known Z coordinate, placed on top of the moving object, like the orange ball in this picture: (image: undistorted)

First, I calibrated the camera to get the intrinsic parameters, and then I used cv::solvePnP to get the rotation and translation vectors, as in the following code:

std::vector<cv::Point2f> imagePoints;
std::vector<cv::Point3f> objectPoints;
//img points are green dots in the picture
imagePoints.push_back(cv::Point2f(271.,109.));
imagePoints.push_back(cv::Point2f(65.,208.));
imagePoints.push_back(cv::Point2f(334.,459.));
imagePoints.push_back(cv::Point2f(600.,225.));

//object points are measured in millimeters because calibration is done in mm also
objectPoints.push_back(cv::Point3f(0., 0., 0.));
objectPoints.push_back(cv::Point3f(-511.,2181.,0.));
objectPoints.push_back(cv::Point3f(-3574.,2354.,0.));
objectPoints.push_back(cv::Point3f(-3400.,0.,0.));

cv::Mat rvec(1,3,cv::DataType<double>::type);
cv::Mat tvec(1,3,cv::DataType<double>::type);
cv::Mat rotationMatrix(3,3,cv::DataType<double>::type);

cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
cv::Rodrigues(rvec,rotationMatrix);

With all the matrices available, this equation lets me transform an image point to world coordinates:

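$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \left( R \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} + t \right) $$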

where M is cameraMatrix, R is rotationMatrix, t is tvec, and s is an unknown scale factor. Zconst is the height of the orange ball, which in this example is 285 mm. So first I need to solve the previous equation for "s", and then I can find the X and Y coordinates for a selected image point:

$$ R^{-1} M^{-1} s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} + R^{-1} t $$
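Written out, the third row of the rearranged equation contains s as its only unknown, because Zconst is known. Here $(\cdot)_3$ denotes the third component of a vector, a notation introduced only for this note:

$$ s \left( R^{-1} M^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \right)_3 = Z_{const} + \left( R^{-1} t \right)_3 \quad\Rightarrow\quad s = \frac{Z_{const} + \left( R^{-1} t \right)_3}{\left( R^{-1} M^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \right)_3} $$

With s known, the world point follows from the original equation:

$$ \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} = R^{-1} \left( s \, M^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} - t \right) $$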

I can solve this for "s" using the last row of the matrices, because Zconst is known. Here is the code for that:

cv::Mat uvPoint = (cv::Mat_<double>(3,1) << 363, 222, 1); // u = 363, v = 222, got this point using mouse callback

cv::Mat leftSideMat  = rotationMatrix.inv() * cameraMatrix.inv() * uvPoint;
cv::Mat rightSideMat = rotationMatrix.inv() * tvec;

double s = (285 + rightSideMat.at<double>(2,0)) / leftSideMat.at<double>(2,0);
//285 represents the height Zconst

std::cout << "P = " << rotationMatrix.inv() * (s * cameraMatrix.inv() * uvPoint - tvec) << std::endl;

After this, I got the result: P = [-2629.5, 1272.6, 285.]

and when I compare it to the measured position, which is: Preal = [-2629.6, 1269.5, 285.]

the error is very small, which is very good. But when I move this box to the edges of the room, the errors are around 20-40 mm, and I would like to improve that. Does anyone have suggestions?
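The same back-projection can be sketched without OpenCV types, which makes it easy to sanity-check with a round trip: project a known world point to a pixel, then recover it from that pixel. This is only a minimal sketch under assumptions that are not in the code above: the names Vec3, Mat3, mul, transpose, project and backProject are hypothetical helpers, M is assumed to have no skew, the pixel is assumed to come from an already undistorted image, and R is assumed to be a proper rotation so that its inverse equals its transpose.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Matrix-vector product A * v.
Vec3 mul(const Mat3& A, const Vec3& v) {
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += A[i][j] * v[j];
    return r;
}

// Transpose; for a rotation matrix this is also the inverse (assumption).
Mat3 transpose(const Mat3& A) {
    Mat3 T{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            T[j][i] = A[i][j];
    return T;
}

// Forward model: s * [u v 1]^T = M * (R * P + t). Returns {u, v, s}.
Vec3 project(const Mat3& M, const Mat3& R, const Vec3& t, const Vec3& P) {
    Vec3 c = mul(R, P);
    for (int i = 0; i < 3; ++i) c[i] += t[i];
    const Vec3 h = mul(M, c);
    return {h[0] / h[2], h[1] / h[2], h[2]};
}

// Back-projection of pixel (u, v) onto the world plane Z = zConst.
Vec3 backProject(const Mat3& M, const Mat3& R, const Vec3& t,
                 double u, double v, double zConst) {
    // M^{-1} * [u v 1]^T for M = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    const Vec3 ray = {(u - M[0][2]) / M[0][0], (v - M[1][2]) / M[1][1], 1.0};
    const Mat3 Rinv = transpose(R);
    const Vec3 left = mul(Rinv, ray);   // R^{-1} M^{-1} [u v 1]^T
    const Vec3 right = mul(Rinv, t);    // R^{-1} t
    // A zero third component means the ray is parallel to the plane.
    assert(std::fabs(left[2]) > 1e-12);
    const double s = (zConst + right[2]) / left[2];
    return {s * left[0] - right[0], s * left[1] - right[1], s * left[2] - right[2]};
}
```

If project followed by backProject returns the original point for a given M, R and t, the remaining error comes from the inputs (calibration, pose, pixel localization) rather than from the back-projection formula itself.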

Answer

Pascal Lécuyot · Sep 6, 2012

Given your configuration, errors of 20-40mm at the edges are average. It looks like you've done everything well.

Without modifying the camera/system configuration, doing better will be hard. You can try to redo the camera calibration and hope for better results, but this will not improve them a lot (and you may end up with worse results, so don't erase your current intrinsic parameters).

As said by count0, if you need more precision you should go for multiple measurements.
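One possible reading of "multiple measurements" is to take several position estimates of the marker while it is stationary and average them. That interpretation is an assumption, and the names XY, Summary and summarize below are hypothetical. Averaging reduces random noise such as jitter in the detected pixel, but it does not remove a systematic bias from calibration or pose. A minimal sketch:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct XY { double x; double y; };

struct Summary {
    XY mean;    // average of the samples
    XY stddev;  // sample standard deviation (0 if only one sample)
};

// Summarize repeated (X, Y) estimates of the same stationary marker.
Summary summarize(const std::vector<XY>& samples) {
    assert(!samples.empty());  // at least one measurement is required
    const std::size_t n = samples.size();

    Summary out{{0.0, 0.0}, {0.0, 0.0}};
    for (const XY& p : samples) {
        out.mean.x += p.x;
        out.mean.y += p.y;
    }
    out.mean.x /= static_cast<double>(n);
    out.mean.y /= static_cast<double>(n);

    if (n > 1) {
        double sx = 0.0, sy = 0.0;
        for (const XY& p : samples) {
            sx += (p.x - out.mean.x) * (p.x - out.mean.x);
            sy += (p.y - out.mean.y) * (p.y - out.mean.y);
        }
        out.stddev.x = std::sqrt(sx / static_cast<double>(n - 1));
        out.stddev.y = std::sqrt(sy / static_cast<double>(n - 1));
    }
    return out;
}
```

The standard deviation gives a rough idea of how much of the error is random: if it is much smaller than the 20-40 mm offset, averaging alone is unlikely to close the gap.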