I have a rectangular target of known dimensions and location on a wall, and a mobile camera on a robot. As the robot is driving around the room, I need to locate the target and compute the location of the camera and its pose. As a further twist, the camera's elevation and azimuth can be changed using servos. I am able to locate the target using OpenCV, but I am still fuzzy on calculating the camera's position (actually, I've gotten a flat spot on my forehead from banging my head against a wall for the last week). Here is what I am doing:
I have read the OpenCV book, but I guess I'm just missing something on how to use the projected points, rotation and translation vectors to compute the world coordinates of the camera and its pose (I'm not a math wiz) :-(
2013-04-02 Following the advice from "morynicz", I have written this simple standalone program.
#include <Windows.h>
#include "opencv\cv.h"
using namespace cv;
int main (int argc, char** argv)
{
const char *calibration_filename = argc >= 2 ? argv [1] : "M1011_camera.xml";
FileStorage camera_data (calibration_filename, FileStorage::READ);
Mat camera_intrinsics, distortion;
vector<Point3d> world_coords;
vector<Point2d> pixel_coords;
Mat rotation_vector, translation_vector, rotation_matrix, inverted_rotation_matrix, cw_translate;
Mat cw_transform = cv::Mat::eye (4, 4, CV_64FC1);
// Read camera data
camera_data ["camera_matrix"] >> camera_intrinsics;
camera_data ["distortion_coefficients"] >> distortion;
camera_data.release ();
// Target rectangle coordinates in feet
world_coords.push_back (Point3d (10.91666666666667, 10.01041666666667, 0));
world_coords.push_back (Point3d (10.91666666666667, 8.34375, 0));
world_coords.push_back (Point3d (16.08333333333334, 8.34375, 0));
world_coords.push_back (Point3d (16.08333333333334, 10.01041666666667, 0));
// Coordinates of rectangle in camera
pixel_coords.push_back (Point2d (284, 204));
pixel_coords.push_back (Point2d (286, 249));
pixel_coords.push_back (Point2d (421, 259));
pixel_coords.push_back (Point2d (416, 216));
// Get vectors for world->camera transform
solvePnP (world_coords, pixel_coords, camera_intrinsics, distortion, rotation_vector, translation_vector, false, 0);
dump_matrix (rotation_vector, String ("Rotation vector"));
dump_matrix (translation_vector, String ("Translation vector"));
// We need inverse of the world->camera transform (camera->world) to calculate
// the camera's location
Rodrigues (rotation_vector, rotation_matrix);
Rodrigues (rotation_matrix.t (), camera_rotation_vector);
Mat t = translation_vector.t ();
camera_translation_vector = -camera_rotation_vector * t;
printf ("Camera position %f, %f, %f\n", camera_translation_vector.at<double>(0), camera_translation_vector.at<double>(1), camera_translation_vector.at<double>(2));
printf ("Camera pose %f, %f, %f\n", camera_rotation_vector.at<double>(0), camera_rotation_vector.at<double>(1), camera_rotation_vector.at<double>(2));
}
The pixel coordinates I used in my test are from a real image that was taken about 27 feet left of the target rectangle (which is 62 inches wide and 20 inches high), at about a 45 degree angle. The output is not what I'm expecting. What am I doing wrong?
Rotation vector
2.7005
0.0328
0.4590
Translation vector
-10.4774
8.1194
13.9423
Camera position -28.293855, 21.926176, 37.650714
Camera pose -2.700470, -0.032770, -0.459009
Will it be a problem if my world coordinates have the Y axis inverted from that of OpenCV's screen Y axis? (the origin of my coordinate system is on the floor to the left of the target, while OpenCV's orgin is the top left of the screen).
What units is the pose in?
You get the translation and rotation vectors from solvePnP
, which are telling where is the object in camera's coordinates. You need to get an inverse transform.
The transform camera -> object can be written as a matrix [R T;0 1]
for homogeneous coordinates. The inverse of this matrix would be, using it's special properties, [R^t -R^t*T;0 1]
where R^t is R transposed. You can get R matrix from Rodrigues transform. This way You get the translation vector and rotation matrix for transformation object->camera coordiantes.
If You know where the object lays in the world coordinates You can use the world->object transform * object->camera transform matrix to extract cameras translation and pose.
The pose is described either by single vector or by the R matrix, You surely will find it in Your book. If it's "Learning OpenCV" You will find it on pages 401 - 402 :)
Looking at Your code, You need to do something like this
cv::Mat R;
cv::Rodrigues(rotation_vector, R);
cv::Mat cameraRotationVector;
cv::Rodrigues(R.t(),cameraRotationVector);
cv::Mat cameraTranslationVector = -R.t()*translation_vector;
cameraTranslationVector
contains camera coordinates. cameraRotationVector
contains camera pose.