OpenCV: solvePnP tvec units and axes directions

Aleksander Lidtke · Jul 2, 2013 · Viewed 10.7k times

I'm trying to find the relative position of the camera to the chessboard (or the other way around) - I feel OK with converting between different coordinate systems, e.g. as suggested here. I decided to use the chessboard not only for calibration but also for actual position determination at this stage, since I can use findChessboardCorners to get the imagePoints (and this works OK).

I've read a lot on this topic and feel that I understand the solvePnP outputs (even though I'm completely new to OpenCV and computer vision in general). Unfortunately, the results I get from solvePnP and from physically measuring the test set-up differ: the translation in the z direction is off by approx. 25%, while the x and y directions are completely wrong, off by several orders of magnitude and in a different direction than what I've read the camera coordinate system to be (x pointing up the image, y to the right, z away from the camera). The difference persists if I convert tvec and rvec to the camera pose in world coordinates.
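For reference, the conversion I mean is essentially the standard one (a minimal sketch; rvec and tvec are the solvePnP outputs, and the function name is just for illustration):

import cv2
import numpy as np

def cameraPoseFromPnP(rvec, tvec):
    # rvec/tvec map world coordinates into camera coordinates; invert that
    # transform to get the camera pose expressed in the world (chessboard) frame.
    R, _ = cv2.Rodrigues(rvec)                   # 3x3 rotation, world -> camera
    camPosition = -R.T.dot(tvec.reshape(3, 1))   # camera centre in world coordinates
    camRotation = R.T                            # camera -> world rotation
    return camPosition, camRotation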

My questions are:

  • What are the directions of camera and world coordinate systems' axes?
  • Does solvePnP output the translation in the same units as I specify the objectPoints?
  • I specified the world origin as the first of the objectPoints (one of the chessboard corners). Is that OK and is tvec the translation to exactly that point from the camera coordinates?

This is my code (I attach it pro forma as it does not throw any exceptions etc.). I used grayscale images to get the camera intrinsics matrix and distortion coefficients during calibration, so I decided to perform localisation in grayscale as well. chessCoordinates is a list of the chessboard point locations in mm with respect to the origin (one of the corner points). camMatrix and distCoefficients come from calibration (performed using the same chessboard and objectPoints).

import cv2
import numpy as np

camCapture = cv2.VideoCapture(0)  # Take a picture of the target to get the imagePoints
retval, tempImg = camCapture.read()
imgPts = []
tgtPts = []

tempImg = cv2.cvtColor(tempImg, cv2.COLOR_BGR2GRAY)  # calibration was done on grayscale images
found_all, corners = cv2.findChessboardCorners(tempImg, chessboardDim)

imgPts.append(corners.reshape(-1, 2))
tgtPts.append(np.array(chessCoordinates, dtype=np.float32))

retval, myRvec, myTvec = cv2.solvePnP(objectPoints=np.array(tgtPts), imagePoints=np.array(imgPts),
                                      cameraMatrix=camMatrix, distCoeffs=distCoefficients)
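For completeness, chessCoordinates is laid out roughly like this (a sketch; the square size here is an illustrative value, and the ordering has to match the order in which findChessboardCorners returns the corners):

squareSize = 25.0  # mm, illustrative value only
cols, rows = chessboardDim  # number of inner corners per row and per column
chessCoordinates = [(c * squareSize, r * squareSize, 0.0)  # z = 0, origin at the first corner
                    for r in range(rows) for c in range(cols)]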

Answer

morynicz · Jul 2, 2013

The camera coordinate axes are aligned with the image axes. So you have the x axis pointing to the right of the camera, the y axis pointing down, and the z axis pointing in the direction the camera is facing. This is a right-handed system, and the same convention applies to the chessboard: if you specified the origin in, let's say, the upper right corner of the chessboard, the x axis goes along the longer side and the y axis along the shorter side of the chessboard, and the z axis would be pointing downward, towards the ground.
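If you want to see those directions on your actual image, you can project short line segments along the world axes back into the picture (a quick sketch reusing the variable names from your code; cv2.projectPoints does the reprojection):

import cv2
import numpy as np

axisLength = 50.0  # same units as the object points, e.g. mm
axes3D = np.float32([[0, 0, 0],            # world origin
                     [axisLength, 0, 0],   # end of the x axis
                     [0, axisLength, 0],   # end of the y axis
                     [0, 0, axisLength]])  # end of the z axis
axes2D, _ = cv2.projectPoints(axes3D, myRvec, myTvec, camMatrix, distCoefficients)
origin, xEnd, yEnd, zEnd = [tuple(int(v) for v in p.ravel()) for p in axes2D]
colourImg = cv2.cvtColor(tempImg, cv2.COLOR_GRAY2BGR)
cv2.line(colourImg, origin, xEnd, (0, 0, 255), 2)  # x axis in red
cv2.line(colourImg, origin, yEnd, (0, 255, 0), 2)  # y axis in green
cv2.line(colourImg, origin, zEnd, (255, 0, 0), 2)  # z axis in blue
cv2.imwrite("axes.png", colourImg)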

solvePnP outputs the translation in the same units in which you specified the lengths of the chessboard squares, but it might also be affected by the units used during camera calibration, as it uses the camera matrix.
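A simple way to convince yourself about the units (a sketch reusing your variables): solve twice with the same object points expressed in millimetres and in metres; the two translations should differ only by a factor of 1000.

import cv2
import numpy as np

objMm = np.array(tgtPts, dtype=np.float32).reshape(-1, 3)  # object points in mm
objM = objMm / 1000.0                                      # the same points in metres
imgs = np.array(imgPts, dtype=np.float32).reshape(-1, 2)
_, _, tvecMm = cv2.solvePnP(objMm, imgs, camMatrix, distCoefficients)
_, _, tvecM = cv2.solvePnP(objM, imgs, camMatrix, distCoefficients)
print(tvecMm.ravel())          # translation in mm
print(tvecM.ravel() * 1000.0)  # should be (nearly) the same numbers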

tvec points from the camera to the origin of the world coordinate system in which you placed the calibration object. So if you placed the first object point at (0, 0, 0), that is where tvec will point to.
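A quick sanity check (a sketch, again reusing the names from your code; it assumes the first detected corner corresponds to the first object point): reproject the world origin and compare it with the first corner, and take the norm of tvec as the straight-line camera-to-origin distance.

import cv2
import numpy as np

origin3D = np.float32([[0.0, 0.0, 0.0]])  # the world origin, i.e. the first object point
projected, _ = cv2.projectPoints(origin3D, myRvec, myTvec, camMatrix, distCoefficients)
print("Reprojected world origin:", projected.ravel())
print("First detected corner:   ", corners[0].ravel())
print("Camera-to-origin distance:", np.linalg.norm(myTvec))  # in the object point units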