Reputation: 11
Currently, I am working on point-of-gaze estimation, where I need to convert the gaze direction (yaw, pitch) predicted by a gaze model trained on the Gaze360 dataset into the actual 3D point the eye was looking at. I went through the Gaze360 paper and understood how they convert the eye gaze from the camera coordinate system to the eye coordinate system, given a target point in camera 3D space. However, I couldn't figure out the inverse operation: going from the gaze output (yaw, pitch) in the eye coordinate system back to the camera coordinate system. Since [Ex, Ey, Ez] depends on gL, the gaze direction in the camera coordinate system, finding the rotation that relates the eye coordinate system to the camera coordinate system is daunting to me. Can you please help me find the rotation that relates the eye coordinate system to the camera coordinate system at inference time, given the (yaw, pitch) predicted by a gaze estimation model trained on the Gaze360 dataset?
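For context, here is my understanding of the forward (camera-to-eye) construction from the paper, sketched in code. The axis conventions here (E_z opposing the gaze, Y-up camera) are my assumptions, not taken verbatim from the paper:

```python
import numpy as np

def normalize(v):
    n = np.linalg.norm(v)
    return v if n == 0 else v / n

def eye_basis_from_gaze(g_L, up=np.array([0.0, 1.0, 0.0])):
    """Build the eye coordinate basis from the gaze direction g_L given
    in the camera frame (conventions assumed): E_z opposes the gaze,
    E_x is orthogonal to both E_z and the camera up vector, and E_y
    completes the right-handed basis."""
    E_z = normalize(-g_L)               # assumed: z-axis points back along the gaze
    E_x = normalize(np.cross(up, E_z))  # degenerate if g_L is parallel to up
    E_y = np.cross(E_z, E_x)
    return np.stack([E_x, E_y, E_z])    # rows = eye axes expressed in camera coords

# Forward transform: gaze in camera coords -> gaze in eye coords
g_L = normalize(np.array([0.2, -0.1, -1.0]))
R = eye_basis_from_gaze(g_L)
g_E = R @ g_L  # by construction this lands on [0, 0, -1] in the eye frame
```

Because this basis is built from g_L itself, I don't see how to invert it at inference time, when only the (yaw, pitch) in the eye frame are available.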
I tried:
import numpy as np

def normalize(v):
    """Normalize a vector."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm

def compute_eye_to_camera_rotation_matrix(yaw, pitch, eye_position, camera_up_vector=np.array([0, 1, 0])):
    """
    Compute the rotation matrix from the eye coordinate system to the camera coordinate system.
    :param yaw: Yaw angle in radians (rotation around the Y-axis in the eye coordinate system)
    :param pitch: Pitch angle in radians (rotation around the X-axis in the eye coordinate system)
    :param eye_position: 3D position of the eye in the camera coordinate system (numpy array)
    :param camera_up_vector: Up vector in the camera coordinate system (default is [0, 1, 0])
    :return: Rotation matrix from eye coordinate system to camera coordinate system (3x3 numpy array)
    """
    # Step 1: Compute E_z in the eye coordinate system
    E_z = np.array([
        np.cos(pitch) * np.cos(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.sin(yaw)
    ])
    E_z = normalize(E_z)  # Normalize E_z to ensure it is a unit vector

    # Step 2: Project the camera's up vector onto the plane perpendicular to E_z
    camera_up_projection = camera_up_vector - np.dot(camera_up_vector, E_z) * E_z
    E_x = normalize(camera_up_projection)  # Normalize E_x to ensure it lies in the X,Y plane of the camera

    # Step 3: Compute E_y by taking the cross product of E_z and E_x
    E_y = np.cross(E_z, E_x)
    E_y = normalize(E_y)  # Normalize E_y to ensure it is a unit vector

    # Step 4: Form the rotation matrix R_EC
    R_EC = np.vstack([E_x, E_y, E_z])
    return R_EC

def convert_gaze_to_camera_coordinate_system(yaw, pitch, eye_position):
    """
    Convert gaze direction from the eye coordinate system to the camera coordinate system.
    :param yaw: Yaw angle in radians (rotation around the Y-axis in the eye coordinate system)
    :param pitch: Pitch angle in radians (rotation around the X-axis in the eye coordinate system)
    :param eye_position: 3D position of the eye in the camera coordinate system (numpy array)
    :return: Gaze direction in the camera coordinate system (3D numpy array)
    """
    # Compute the gaze direction in the eye coordinate system
    g_E = np.array([
        np.cos(pitch) * np.cos(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.sin(yaw)
    ])
    # Compute the rotation matrix from eye to camera coordinate system
    R_EC = compute_eye_to_camera_rotation_matrix(yaw, pitch, eye_position)
    # Convert the gaze direction to the camera coordinate system
    g_C = R_EC.T @ g_E  # Transform gaze vector to camera coordinate system
    return g_C

# Example usage:
yaw = np.radians(30)    # 30 degrees yaw
pitch = np.radians(10)  # 10 degrees pitch
eye_position = np.array([0, 0, 0])  # Assume eye is at the origin of the camera system for simplicity

gaze_in_camera_system = convert_gaze_to_camera_coordinate_system(yaw, pitch, eye_position)
print("Gaze direction in camera coordinate system:", gaze_in_camera_system)
but this is incorrect, because the method I used is only valid in the other direction: from the gaze in the camera coordinate system to the eye coordinate system.
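To rule out the spherical decode itself as the source of error, here is a minimal round-trip check (yaw/pitch to a unit vector and back) using the same axis convention as in my attempt; it recovers the angles exactly, so the open problem is only the unknown eye-to-camera rotation:

```python
import numpy as np

def yawpitch_to_vec(yaw, pitch):
    # Same convention as in my attempt above
    return np.array([np.cos(pitch) * np.cos(yaw),
                     np.sin(pitch),
                     np.cos(pitch) * np.sin(yaw)])

def vec_to_yawpitch(v):
    # Inverse of the decode above, for unit vectors
    v = v / np.linalg.norm(v)
    pitch = np.arcsin(v[1])
    yaw = np.arctan2(v[2], v[0])
    return yaw, pitch

yaw, pitch = np.radians(30), np.radians(10)
v = yawpitch_to_vec(yaw, pitch)
yaw2, pitch2 = vec_to_yawpitch(v)
# the round trip recovers the original yaw and pitch
```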
Upvotes: 0
Views: 38