Tolga

Reputation: 11

I am trying to estimate the car's motion with two cameras in CARLA, but the results are meaningless

I have two cameras mounted at fixed positions on the vehicle, and the parameters of the two cameras are exactly the same. First, I record an image with each camera; I call these the "initial" images. Then, after one frame (or some fixed period of time) has passed, I record images again and call them the "delay" images.
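
For reference, the cameras are attached roughly like this (a minimal sketch; the exact spawn offsets and vehicle blueprint are illustrative, only the resolution, FOV and the 1.0 m lateral spacing match my setup):

import carla

# A minimal sketch of the rig; spawn offsets and blueprint are illustrative
client = carla.Client('localhost', 2000)
world = client.get_world()
blueprint_library = world.get_blueprint_library()

# Spawn a vehicle to carry the cameras
vehicle_bp = blueprint_library.filter('vehicle.*')[0]
vehicle = world.spawn_actor(vehicle_bp, world.get_map().get_spawn_points()[0])

# Two identical RGB cameras: 640x480, 110 degree FOV, 1.0 m apart laterally
camera_bp = blueprint_library.find('sensor.camera.rgb')
camera_bp.set_attribute('image_size_x', '640')
camera_bp.set_attribute('image_size_y', '480')
camera_bp.set_attribute('fov', '110')

left = carla.Transform(carla.Location(x=1.5, y=-0.5, z=2.4))
right = carla.Transform(carla.Location(x=1.5, y=0.5, z=2.4))
camera1 = world.spawn_actor(camera_bp, left, attach_to=vehicle)
camera2 = world.spawn_actor(camera_bp, right, attach_to=vehicle)

# Save every frame; the frame number pairs up the "initial" and "delay" images
camera1.listen(lambda image: image.save_to_disk('cam1_%06d.png' % image.frame))
camera2.listen(lambda image: image.save_to_disk('cam2_%06d.png' % image.frame))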

I perform ORB feature extraction and matching between the camera 1 initial image and the camera 2 initial image. I do the same for the camera 1 delay and camera 2 delay images.
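
The matching itself is the standard ORB plus brute-force Hamming pipeline; this is roughly what my find_keypoints_and_matches does (a sketch, not the exact code):

import cv2
import numpy as np

def find_keypoints_and_matches(img1, img2):
    # Detect ORB keypoints and compute binary descriptors in both images
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Brute-force matching with Hamming distance; crossCheck keeps mutual matches
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

    # Return the matched pixel coordinates as (N, 2) arrays
    points1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    points2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    return points1, points2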

I convert the matched points from the initial images into 3D points by triangulation and call them points_3d_initial. I do the same for the delay images and call them points_3d_after_delay.

Finally, I pass these points into a function called calculate_relative_motion.

My whole goal is to obtain the rotation matrix and translation vector of the vehicle relative to the world after it moves. Is the problem in my logic or in my code? The values I get do not make sense; they come out far too large or too small, and I cannot obtain anything meaningful. Below you can see the functions I use and the results I obtain.

import cv2
import numpy as np

def triangulate_points(self, points1, points2):
    # Constant translation between the cameras: the 1.0 m horizontal baseline
    # lies along the camera x axis, and the extrinsic translation of camera 2
    # is minus its position (camera 2 assumed 1.0 m to the right of camera 1)
    T = np.array([-1.0, 0, 0]).reshape(3, 1)

    # Camera parameters
    image_size_x = 640
    image_size_y = 480
    fov = 110  # In degrees

    # Calculate the intrinsic parameter matrix
    f = image_size_x / (2 * np.tan(fov * np.pi / 360))
    cx = image_size_x / 2
    cy = image_size_y / 2
    K = np.array([[f, 0, cx],
                  [0, f, cy],
                  [0, 0, 1]])

    # Create the rotation matrix and projection matrices
    R = np.eye(3)
    P1 = np.hstack((K, np.zeros((3, 1))))  # First camera: K [I | 0]
    P2 = K @ np.hstack((R, T))             # Second camera: K [R | T] (the intrinsics multiply T as well)

    # Triangulate the points
    try:
        points1 = points1.reshape(-1, 2).astype(float)
        points2 = points2.reshape(-1, 2).astype(float)
        points4D = cv2.triangulatePoints(P1, P2, points1.T, points2.T)
        points4D = points4D.astype(float)
        mask = points4D[3] != 0
        if np.any(mask):
            points4D[:, mask] /= points4D[3, mask]
            points3D = points4D[:3, mask].T
            print(f"Number of valid 3D points after triangulation: {len(points3D)}")
            return points3D
        else:
            print("Warning: No valid points after triangulation.")
            return None
    except cv2.error as e:
        print(f"Error in triangulate_points: {e}")
        return None
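
One sanity check on the triangulation is to reproject the 3D points through both projection matrices and compare them with the measured pixel coordinates (a sketch, assuming the intermediate variables P1, P2, points1, points2 and the returned points3D are available, and that the mask kept every point):

import numpy as np

def reprojection_error(P, points3D, points2D):
    # Project: x ~ P [X Y Z 1]^T, then divide by the homogeneous coordinate
    pts_h = np.hstack((points3D, np.ones((len(points3D), 1))))
    proj = P @ pts_h.T
    proj = (proj[:2] / proj[2]).T
    # Mean pixel distance between reprojected and measured points
    return np.mean(np.linalg.norm(proj - points2D, axis=1))

# A few pixels of error is normal; large errors point at P1/P2 or the matches
print("cam1 reprojection error:", reprojection_error(P1, points3D, points1))
print("cam2 reprojection error:", reprojection_error(P2, points3D, points2))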

def calculate_relative_motion(self, points_3d_initial, points_3d_after_delay):
    if points_3d_initial is None or points_3d_after_delay is None:
        print("Error: Invalid input data.")
        return None, None

    if points_3d_initial.shape != points_3d_after_delay.shape:
        print("Error: Input point clouds must have the same dimensions.")
        return None, None

    # Center the points by subtracting their means
    mean_initial = np.mean(points_3d_initial, axis=0)
    mean_after_delay = np.mean(points_3d_after_delay, axis=0)

    centered_initial = points_3d_initial - mean_initial
    centered_after_delay = points_3d_after_delay - mean_after_delay

    # Estimate the rotation with the Kabsch algorithm: SVD of the cross-covariance
    # (this assumes the two clouds are in point-to-point correspondence)
    H = np.dot(centered_initial.T, centered_after_delay)
    U, S, Vt = np.linalg.svd(H)
    R = np.dot(Vt.T, U.T)

    # Correct for reflection if needed
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = np.dot(Vt.T, U.T)

    # Translation that maps the initial centroid onto the delayed centroid
    t = mean_after_delay - np.dot(R, mean_initial)

    # Create the transformation matrix
    relative_motion_matrix = np.identity(4)
    relative_motion_matrix[:3, :3] = R
    relative_motion_matrix[:3, 3] = t

    return relative_motion_matrix, t
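
To check calculate_relative_motion in isolation, it can be fed synthetic points with a known transform (a sketch; the 30 degree yaw and the translation are arbitrary, and image_processor is my object from the usage code below):

import numpy as np

# Known ground-truth motion: a 30 degree yaw plus a translation
theta = np.radians(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0,              0,             1]])
t_true = np.array([1.0, 0.5, 0.0])

# Random points and their exactly corresponded, transformed copies
rng = np.random.default_rng(0)
pts_initial = rng.uniform(-10.0, 10.0, size=(50, 3))
pts_delay = pts_initial @ R_true.T + t_true

M, t = image_processor.calculate_relative_motion(pts_initial, pts_delay)
print(np.allclose(M[:3, :3], R_true), np.allclose(t, t_true))  # expect: True True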

I am using it like this:

    # Find keypoints and matches for the initial images
    points1_initial, points2_initial = image_processor.find_keypoints_and_matches(img1_initial, img2_initial)

    # Find keypoints and matches for the images after delay
    points1_after_delay, points2_after_delay = image_processor.find_keypoints_and_matches(img1_after_delay, img2_after_delay)

    # Triangulate 3D points
    points_3d_initial = image_processor.triangulate_points(points1_initial, points2_initial)
    points_3d_after_delay = image_processor.triangulate_points(points1_after_delay, points2_after_delay)

    # Calculate the relative motion between cameras
    relative_motion_matrix, translation_vector = image_processor.calculate_relative_motion(points_3d_initial, points_3d_after_delay)
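
For comparison, the ground-truth motion can be read directly from CARLA (a sketch, assuming vehicle is the carla.Vehicle actor and its transform is sampled at the same moments the two image pairs are captured; note that CARLA's world frame is left-handed and Unreal-style, so it will not match a camera-frame estimate axis-for-axis without a frame conversion):

import numpy as np

transform_initial = vehicle.get_transform()   # sampled at the "initial" frame
# ... vehicle moves for one frame / the chosen delay ...
transform_delay = vehicle.get_transform()     # sampled at the "delay" frame

# 4x4 world-from-vehicle matrices, and the relative motion between them
M_initial = np.array(transform_initial.get_matrix())
M_delay = np.array(transform_delay.get_matrix())
M_relative_gt = np.linalg.inv(M_initial) @ M_delay

print("estimated:\n", relative_motion_matrix)
print("ground truth:\n", M_relative_gt)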

I also tried taking measurements, but this time the results came out too small. I have tried many different approaches, but none of them work; I am clearly making a mistake somewhere.

Results are here: [image of the printed output]

Upvotes: 1

Views: 101

Answers (0)
