Carpetfizz

Reputation: 9149

OpenCV recoverPose camera coordinate system

I'm estimating the translation and rotation of a single camera using the following code.

E, mask = cv2.findEssentialMat(k1, k2,
                               focal = SCALE_FACTOR * 2868,
                               pp = (1920/2 * SCALE_FACTOR, 1080/2 * SCALE_FACTOR),
                               method = cv2.RANSAC,
                               prob = 0.999,
                               threshold = 1.0)

points, R, t, mask = cv2.recoverPose(E, k1, k2)

where k1 and k2 are my matching set of key points, which are Nx2 matrices where the first column is the x-coordinates and the second column is y-coordinates.
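For reference, the `focal` and `pp` arguments above are equivalent to this intrinsic matrix; a minimal sketch, where `SCALE_FACTOR = 1.0` is a hypothetical stand-in for whatever resize factor the real code uses:

```python
import numpy as np

# Hypothetical value; in the real code SCALE_FACTOR comes from however
# the input frames were resized.
SCALE_FACTOR = 1.0

# Intrinsic matrix equivalent to the focal/pp arguments passed above.
K = np.array([[SCALE_FACTOR * 2868, 0.0, 1920 / 2 * SCALE_FACTOR],
              [0.0, SCALE_FACTOR * 2868, 1080 / 2 * SCALE_FACTOR],
              [0.0, 0.0, 1.0]])
```

Note that `cv2.recoverPose(E, k1, k2)` as written uses OpenCV's defaults (`focal = 1.0`, `pp = (0, 0)`); passing the same `focal` and `pp` as in `findEssentialMat` is usually necessary for the pose to be consistent with the essential matrix.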

I collect all the translations over several frames and generate a path that the camera traveled like this.

import numpy as np

def generate_path(rotations, translations):
    path = []
    current_point = np.array([0, 0, 0])

    for R, t in zip(rotations, translations):
        path.append(current_point)
        # don't care about rotation of a single point
        current_point = current_point + t.reshape((3,))

    return np.array(path)
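Summing the raw translations treats every `t` as if it were expressed in one global frame. If the camera turns, the relative poses need to be chained instead; a sketch under the assumption (the recoverPose convention) that each `(R, t)` maps points from the previous camera frame into the current one. The function name is hypothetical:

```python
import numpy as np

def generate_path_chained(rotations, translations):
    # Compose relative poses x_cur = R @ x_prev + t into the frame of
    # the first camera, recording each camera centre as -R_acc.T @ t_acc.
    R_acc = np.eye(3)
    t_acc = np.zeros(3)
    centres = [t_acc.copy()]  # the first camera sits at the origin
    for R, t in zip(rotations, translations):
        t = np.asarray(t, dtype=float).reshape(3)
        t_acc = R @ t_acc + t      # accumulate translation
        R_acc = R @ R_acc          # accumulate rotation
        centres.append(-R_acc.T @ t_acc)
    return np.array(centres)
```

With identity rotations this reduces to a cumulative sum of the (negated) translations, matching the simpler function above up to sign convention.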

So, I have a few issues with this.

  1. The OpenCV camera coordinate system suggests that if I want to view the 2D "top down" view of the camera's path, I should plot the translations along the X-Z plane.

plt.plot(path[:,0], path[:,2])

(plot: path along the X-Z plane)

This is completely wrong.

However, if I write this instead

plt.plot(path[:,0], path[:,1])

I get the following (after doing some averaging)

(plot: smoothed path along the X-Y plane)

This path is basically perfect. So, perhaps I am misunderstanding the coordinate system convention used by cv2.recoverPose? Why should the "birds eye view" of the camera path be along the XY plane and not the XZ plane?

  2. Another, perhaps unrelated, issue is that the reported Z-translation appears to decrease linearly over time, which doesn't really make sense.

(plot: Z-translation over frames, decreasing roughly linearly)

I'm pretty sure there's a bug in my code since these issues appear systematic - but I wanted to make sure my understanding of the coordinate system was correct so I can restrict the search space for debugging.

Upvotes: 3

Views: 4399

Answers (1)

Rengao Zhou

Reputation: 106

First of all, your method is not producing a real path. The translation t produced by recoverPose() is always a unit vector, so in your "path" every frame moves exactly 1 "meter" from the previous frame. The correct method would be: 1) initialize (featureMatch, findEssentialMat, recoverPose), then 2) track (triangulate, featureMatch, solvePnP). If you would like to dig deeper, tutorials on monocular visual SLAM would help.
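The unit-length point can be checked directly; a small sketch with a made-up vector standing in for a recoverPose output, where the per-frame scale is a hypothetical value that would have to come from an external source (triangulated landmarks, an IMU, wheel odometry):

```python
import numpy as np

# recoverPose() returns t only up to scale, normalised to unit length,
# so summing the raw vectors assumes equal motion every frame.
t = np.array([[0.6], [0.0], [0.8]])   # stand-in for a recoverPose output
assert np.isclose(np.linalg.norm(t), 1.0)

# A metric step therefore needs a per-frame scale estimated elsewhere.
scale = 0.25                          # hypothetical metres for this frame
step = scale * t.reshape(3)
```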

Secondly, you might have mixed up the camera coordinate system and the world coordinate system. If you want to plot the trajectory, you should use the world coordinate system rather than the camera coordinate system; the results of recoverPose() are also in the world coordinate system. That world coordinate system is: x-axis pointing right, y-axis pointing forward, z-axis pointing up. Thus, when you want to plot the "bird's-eye view", it is correct to plot along the X-Y plane.
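For concreteness, moving a point from the OpenCV camera convention (x right, y down, z forward) into a world frame with x right, y forward, z up is a fixed axis permutation; a sketch, where the matrix encodes an assumed alignment between the two frames:

```python
import numpy as np

# Camera axes: x right, y down, z forward.
# World axes:  x right, y forward, z up.
cam_to_world = np.array([[1, 0, 0],    # world x =  camera x
                         [0, 0, 1],    # world y =  camera z
                         [0, -1, 0]])  # world z = -camera y
p_cam = np.array([1.0, 2.0, 3.0])
p_world = cam_to_world @ p_cam         # -> [1.0, 3.0, -2.0]
```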

Upvotes: 2
