Python Opencv SolvePnP yields wrong translation vector

amartin7211 picture amartin7211 · Jan 25, 2013 · Viewed 12.5k times · Source

I am attempting to calibrate and find the location and rotation of a single virtual camera in Blender 3d using homography. I am using Blender so that I can double check my results before I move on to the real world where that is more difficult.

I rendered ten pictures of a chess board in various locations and rotations in the view of my stationary camera. With OpenCV's Python, I used cv2.calibrateCamera to find the intrinsic matrix from the detected corners of the chess board in the ten images and then used that in cv2.solvePnP to find the extrinsic parameters(translation and rotation).

However, though the estimated parameters were close to the actual ones, there is something fishy going on. My initial estimation of the translation was (-0.11205481,-0.0490256,8.13892491). The actual location was (0,0,8.07105). Pretty close right?

But, when I moved and rotated the camera slightly and rerendered the images, the estimated translation became farther off. Estimated: (-0.15933154,0.13367286,9.34058867). Actual: (-1.7918,-1.51073,9.76597). The Z value is close, but the X and the Y are not.

I am utterly confused. If anybody can help me sort through this, I would be highly grateful. Here is the code (it's based off of the Python2 calibrate example supplied with OpenCV):

#imports left out
USAGE = '''
USAGE: calib.py [--save <filename>] [--debug <output path>] [--square_size] [<image mask>]
'''   

args, img_mask = getopt.getopt(sys.argv[1:], '', ['save=', 'debug=', 'square_size='])
args = dict(args)
try: img_mask = img_mask[0]
except: img_mask = '../cpp/0*.png'
img_names = glob(img_mask)
debug_dir = args.get('--debug')
square_size = float(args.get('--square_size', 1.0))

pattern_size = (5, 8)
pattern_points = np.zeros( (np.prod(pattern_size), 3), np.float32 )
pattern_points[:,:2] = np.indices(pattern_size).T.reshape(-1, 2)
pattern_points *= square_size

obj_points = []
img_points = []
h, w = 0, 0
count = 0
for fn in img_names:
    print 'processing %s...' % fn,
    img = cv2.imread(fn, 0)
    h, w = img.shape[:2]
    found, corners = cv2.findChessboardCorners(img, pattern_size)        

    if found:
        if count == 0:
            #corners first is a list of the image points for just the first image.
            #This is the image I know the object points for and use in solvePnP
            corners_first =  []
            for val in corners:
                corners_first.append(val[0])                
            np_corners_first = np.asarray(corners_first,np.float64)                
        count+=1
        term = ( cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_COUNT, 30, 0.1 )
        cv2.cornerSubPix(img, corners, (5, 5), (-1, -1), term)
    if debug_dir:
        vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        cv2.drawChessboardCorners(vis, pattern_size, corners, found)
        path, name, ext = splitfn(fn)
        cv2.imwrite('%s/%s_chess.bmp' % (debug_dir, name), vis)
    if not found:
        print 'chessboard not found'
        continue
    img_points.append(corners.reshape(-1, 2))
    obj_points.append(pattern_points)        

    print 'ok'

rms, camera_matrix, dist_coefs, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, (w, h))
print "RMS:", rms
print "camera matrix:\n", camera_matrix
print "distortion coefficients: ", dist_coefs.ravel()    
cv2.destroyAllWindows()    

np_xyz = np.array(xyz,np.float64).T #xyz list is from file. Not shown here for brevity
camera_matrix2 = np.asarray(camera_matrix,np.float64)
np_dist_coefs = np.asarray(dist_coefs[:,:],np.float64)    

found,rvecs_new,tvecs_new = cv2.solvePnP(np_xyz, np_corners_first,camera_matrix2,np_dist_coefs)

np_rodrigues = np.asarray(rvecs_new[:,:],np.float64)
print np_rodrigues.shape
rot_matrix = cv2.Rodrigues(np_rodrigues)[0]

def rot_matrix_to_euler(R):
    y_rot = asin(R[2][0]) 
    x_rot = acos(R[2][2]/cos(y_rot))    
    z_rot = acos(R[0][0]/cos(y_rot))
    y_rot_angle = y_rot *(180/pi)
    x_rot_angle = x_rot *(180/pi)
    z_rot_angle = z_rot *(180/pi)        
    return x_rot_angle,y_rot_angle,z_rot_angle

print "Euler_rotation = ",rot_matrix_to_euler(rot_matrix)
print "Translation_Matrix = ", tvecs_new

Answer

fraxel picture fraxel · Jan 27, 2013

I think you may be thinking of tvecs_new as the camera position. Slightly confusingly that is not the case! In fact it is the position of the world origin in camera co-ords. To get the camera pose in the object/world co-ords, I believe you need to

-np.matrix(rotation_matrix).T * np.matrix(tvecs_new)

And you can get the Euler angles using cv2.decomposeProjectionMatrix(P)[-1] where P is the [r|t] 3 by 4 extrinsic matrix.

I found this to be a pretty good article about the intrinsics and extrinsics...