Sasha San Antonio

Reputation: 21

Visual Odometry - Problem with Rotation and Translation

I'm trying to reconstruct the trajectory of a camera by extracting features from images and computing the camera pose; however, the trajectory is not what I expected. For my visual odometry system I am using an Intel RealSense D435i camera, which provides color and depth frames. My code is written in Python, and all the important functions come from the OpenCV library.

I am using a dataset where the camera just moves straight forward, so I expect the rotation to be the identity matrix and the translation to point forward. However, this isn't what I get when I compute them.

This is my approach:

  1. Read the color frames, convert them to grayscale, and apply CLAHE to reduce lighting contrast
  2. Extract SIFT features from frame t and frame t+1
  3. Match them using FLANN and apply Lowe's ratio test to filter outliers
  4. Transform the 2D pixel coordinates (u1p, v1p) of the features to 3D world coordinates using the camera intrinsic matrix (K) and the depth of the pixels (s), acquired from the depth frames (see the sketch after the formulas below):

Normalization: x_y_norm = K^-1 * [u1p, v1p, 1]^T
Scaling: x_y_norm_s = s * x_y_norm
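A minimal sketch of what steps 1-4 look like in code (the names color_t, color_t1, depth_t, K, and depth_scale are just illustrative, not my actual variables):

```python
import cv2
import numpy as np

# Illustrative inputs: two aligned color frames, the depth frame for time t,
# the 3x3 intrinsic matrix K, and the RealSense depth scale (units -> metres).
# color_t, color_t1: HxWx3 uint8; depth_t: HxW uint16

def preprocess(frame):
    # Grayscale + CLAHE to reduce lighting contrast (step 1)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray)

gray_t, gray_t1 = preprocess(color_t), preprocess(color_t1)

# SIFT features in frame t and frame t+1 (step 2)
sift = cv2.SIFT_create()
kp_t, des_t = sift.detectAndCompute(gray_t, None)
kp_t1, des_t1 = sift.detectAndCompute(gray_t1, None)

# FLANN matching with Lowe's ratio test (step 3)
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
good = [m for m, n in flann.knnMatch(des_t, des_t1, k=2)
        if m.distance < 0.7 * n.distance]

pts_t = np.float32([kp_t[m.queryIdx].pt for m in good])    # pixels, frame t
pts_t1 = np.float32([kp_t1[m.trainIdx].pt for m in good])  # pixels, frame t+1

# Back-project frame-t pixels to 3D: normalize with K^-1, then scale by the
# measured depth s (step 4)
K_inv = np.linalg.inv(K)
pts3d = np.float32([depth_t[int(v), int(u)] * depth_scale
                    * (K_inv @ np.array([u, v, 1.0]))
                    for (u, v) in pts_t])
```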

Afterwards I try two methods: PnP and the essential matrix.

A) PnP: 4. Use cv2.solvePnPRansac() to get the rotation vector and the translation vector. The inputs of the function are the normalized and scaled keypoints of frame t (object points) and the normalized keypoints of frame t+1 (image points). The rotation matrix is then computed with cv2.Rodrigues() (a sketch of this branch follows below).
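Sketch of the PnP branch, continuing from the variables above (since I normalize the image points with K^-1, I pass the identity as the camera matrix):

```python
# Normalize the frame-t+1 keypoints with K^-1 (undistortPoints with no
# distortion coefficients does exactly this).
pts_t1_norm = cv2.undistortPoints(pts_t1.reshape(-1, 1, 2), K, None)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d.reshape(-1, 1, 3),  # 3D object points from frame t
    pts_t1_norm,              # normalized 2D image points from frame t+1
    np.eye(3),                # identity, because points are already normalized
    None)                     # no distortion

R, _ = cv2.Rodrigues(rvec)    # rotation vector -> 3x3 rotation matrix
```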

B) Essential matrix: 4. Use cv2.findEssentialMat() with the normalized x-y coordinates of the keypoints as input. Check that the essential matrix is computed correctly by inspecting its singular values. Use cv2.recoverPose() to get the rotation and translation (here I also used the normalized x-y coordinates as input; a sketch follows below).
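Sketch of the essential matrix branch (again with normalized coordinates, so the camera matrix is the identity and the RANSAC threshold is in normalized units):

```python
pts_t_norm = cv2.undistortPoints(pts_t.reshape(-1, 1, 2), K, None).reshape(-1, 2)
pts_t1_norm = cv2.undistortPoints(pts_t1.reshape(-1, 1, 2), K, None).reshape(-1, 2)

# With normalized coordinates the camera matrix is the identity and the
# RANSAC threshold must be small (it is measured in normalized units).
E, mask = cv2.findEssentialMat(pts_t_norm, pts_t1_norm, np.eye(3),
                               method=cv2.RANSAC, prob=0.999, threshold=1e-3)

# Sanity check: a valid essential matrix has singular values (s, s, 0)
print("singular values:", np.linalg.svd(E, compute_uv=False))

# recoverPose returns R and t, with t only defined up to scale
n_inl, R, t, mask = cv2.recoverPose(E, pts_t_norm, pts_t1_norm, np.eye(3),
                                    mask=mask)
```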

  5. Compute the trajectory as follows, assuming the second camera pose is [R|t] and the first pose is the identity matrix, as stated in the book Multiple View Geometry (see the sketch after these formulas):

camera_pose[t+1] = pose[t] * pose[t+1]^-1 (get the pose relative to the initial position by multiplying the previous and the current pose, as stated in "Combining relative camera rotations & translations with known pose")

position = pose[t+1] * [0, 0, 0, 1]^T (position calculated from the origin)

trajectory[:,t+1] = position[0:3]
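And a sketch of how I chain the poses (R_list, t_list, and n_frames are illustrative containers for the per-frame R and t estimated above):

```python
def to_homogeneous(R, t):
    # Stack [R|t] into a 4x4 homogeneous transform
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t.ravel()
    return T

pose = np.eye(4)                      # first camera pose = identity
trajectory = np.zeros((3, n_frames))  # n_frames: number of frames processed

for i in range(1, n_frames):
    T_rel = to_homogeneous(R_list[i], t_list[i])  # relative pose [R|t]
    pose = pose @ np.linalg.inv(T_rel)            # pose[t+1] = pose[t] * rel^-1
    trajectory[:, i] = (pose @ np.array([0.0, 0.0, 0.0, 1.0]))[:3]
```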

I hope someone can help me, since I don't know where my error is and neither the PnP method nor the essential matrix method works.

Edit: For both methods I added a picture of the trajectory (in the Z-X and the X-Y plane) that I get from the first 15 frames of my dataset, and an extract of a text file containing the rotation matrix and the translation vector for some frames. The Z-X plane should show dots forming a line in the positive direction, since my dataset only shows the camera moving straight forward; however, a random trajectory appears instead. The green dots are old camera poses and the red dot is the current one. I also added an example of the features extracted in the first frame and matched with the second frame; they seem to be fine.

Images: Trajectory via PnP, Pose via PnP, Trajectory via essential matrix, Pose via essential matrix, Matches of the first frame

Upvotes: 2

Views: 770

Answers (0)
