Jitters/Jumps in Camera Pose Estimation Using ArUco Markers

I'm working on a project to track camera pose using static ArUco markers. The markers are placed around a room and their world poses are known. As the camera moves, it detects the markers and estimates each marker's pose in the camera frame; combining this with the marker's known world pose gives the estimated camera pose. I've implemented a solution using OpenCV and Python, but I'm experiencing significant jitter, or jumps, in the estimated camera pose, especially when the camera sees different markers in consecutive frames.

Here is the code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Intrinsic camera parameters
camera_matrix = np.array([
    [522.67889404, 0, 648.63323975],
    [0, 522.67889404, 365.60263062],
    [0, 0, 1]
])

# Distortion coefficients
dist_coeffs = np.zeros(5)

# Marker size in meters
marker_size = 0.03

# Known ArUco marker poses in world coordinates
marker_poses_world = {
    1: {'position': [-0.0366, 0.1280, 0.2941], 'rvec': np.array([0, -np.pi/2, 0], dtype=np.float32)},
    # ... (other markers)
}

# Load the predefined dictionary
aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
parameters = cv2.aruco.DetectorParameters()
detector = cv2.aruco.ArucoDetector(aruco_dict, parameters)

# Function to convert rvec and tvec to a transformation matrix
def pose_to_transformation(rvec, tvec):
    rotation_matrix, _ = cv2.Rodrigues(rvec)
    transformation_matrix = np.eye(4)
    transformation_matrix[:3, :3] = rotation_matrix
    transformation_matrix[:3, 3] = tvec.flatten()
    return transformation_matrix

# Open video input
video_path = "test2.mp4"
cap = cv2.VideoCapture(video_path)

paused = False

try:
    while cap.isOpened():
        if not paused:
            ret, frame = cap.read()
            if not ret:
                break

            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            corners, ids, rejected = detector.detectMarkers(gray)

            if ids is not None and len(ids) > 0:
                rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
                    corners, marker_size, camera_matrix, dist_coeffs)

                cv2.aruco.drawDetectedMarkers(frame, corners, ids)

                for i in range(len(ids)):
                    marker_id = ids[i][0]
                    print(f"Detected Marker ID: {marker_id}")

                    cv2.drawFrameAxes(frame, camera_matrix, dist_coeffs, rvecs[i], tvecs[i], 0.1)

                    T_cam_marker = pose_to_transformation(rvecs[i], tvecs[i])
                    print(f"Marker {marker_id} Transformation Matrix (Camera Frame):\n", T_cam_marker)

                    if marker_id in marker_poses_world:
                        marker_data = marker_poses_world[marker_id]
                        T_world_marker = np.eye(4)
                        T_world_marker[:3, 3] = marker_data['position']
                        R_world_marker, _ = cv2.Rodrigues(marker_data['rvec'])
                        T_world_marker[:3, :3] = R_world_marker
                        T_world_cam = T_world_marker @ np.linalg.inv(T_cam_marker)

                        camera_position_world_meters = T_world_cam[:3, 3]
                        camera_position_world_cm = camera_position_world_meters * 100

                        print(f"Camera Position (World): X: {camera_position_world_cm[0]:.2f}cm, Z: {camera_position_world_cm[2]:.2f}cm")
                        print("Camera Transformation Matrix:\n", T_world_cam)

            cv2.imshow("Video Input Pose Estimation", frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        elif key == ord(' '):
            paused = not paused
finally:
    cap.release()
    cv2.destroyAllWindows()

Despite this, the camera pose estimates still exhibit significant jitter, especially when different markers are detected in consecutive frames.

Here are some frames from my video and the 2D map my code outputs:

[Images: video frame 1, video frame 2, output 2D map]

Questions:

  1. Are there any additional techniques or filters I can apply to further smooth the camera pose estimates?
  2. Is there a better way to handle the transition between different markers to reduce jumps in the estimated pose?
  3. Would implementing a Kalman filter help in this scenario, and if so, how can I integrate it with the existing code?

Any insights or suggestions would be greatly appreciated!

Upvotes: 0

Views: 41

Answers (1)

Francesco Callari

Reputation: 11825

Generally speaking, you should not expect independent point-in-time estimates of anything to be temporally smooth, because your estimation model never asks for smoothness: each frame's pose is computed from that frame's detections alone.

Suggest you read up on Kalman Filter and/or Extended Kalman Filter.

Upvotes: 0
