The setting is given a coordinate system shown below, where the z-axis points out of the screen (towards you). The camera focal length is 270 pixels and the image resolution is 640x480. There is an object somewhere in 3D space, and two drones d1 and d2 take observations of it from two different viewpoints: d1 is at (6, 3, 2) and the corresponding image coordinate of the object is (320, 280), while d2 is at (9.5, 4.5, 3) with image coordinate (160, 408). The heading of d1 is -20 degrees from the y-axis and that of d2 is +30 degrees from the y-axis. The drones are hovering over the xy plane, and the goal is to determine the (x, y, z) location of the object.
[Image: illustration of the coordinate system, drone positions and headings]
Given this information, and letting d1 be the reference frame, the camera intrinsics are K = [[270, 0, 320], [0, 270, 240], [0, 0, 1]]. The transformation between the two views is a rotation of +50 degrees about the z-axis (the relative heading is 30 - (-20) = 50 degrees) and a translation t = [3.5, 1.5, 1] (the position of d2 minus the position of d1). Therefore my code is:
import numpy as np
import cv2

def pixel2cam(pt, K):
    # convert pixel coordinates to normalized camera coordinates
    u = (pt[0] - K[0][2]) / K[0][0]
    v = (pt[1] - K[1][2]) / K[1][1]
    return np.array([u, v], dtype=np.float32)

def triangulate(points_1, points_2, K, R, t):
    cam_pts_1 = pixel2cam(points_1, K).reshape(2, 1)
    cam_pts_2 = pixel2cam(points_2, K).reshape(2, 1)
    T1 = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0]], dtype=np.float32)
    T2 = np.hstack((R, t))
    X = cv2.triangulatePoints(T1, T2, cam_pts_1, cam_pts_2)
    X /= X[3]
    return X

K = np.array([[270, 0, 320], [0, 270, 240], [0, 0, 1]], dtype=np.float32)
# rotate +50 degrees along z axis
R = np.array([[0.643, -0.766, 0], [0.766, 0.643, 0], [0, 0, 1]], dtype=np.float32)
t = np.array([[3.5], [1.5], [1]], dtype=np.float32)
pt_1 = (320, 280)
pt_2 = (160, 408)
X = triangulate(pt_1, pt_2, K, R, t)
and that gives the homogeneous coordinate X = [[-2.4155867], [-5.1455526], [-12.032189], [1.]], where z is negative. So my question is: did I set up R and t incorrectly here, or am I misunderstanding how OpenCV works? Any help is appreciated!
I see multiple issues in your code; I'll go through them one by one in the sections below.
Firstly, OpenCV's cv2.triangulatePoints() takes the projection matrices from world to pixel coordinates, together with the pixel coordinates of your world point in each image (not the normalized coordinates that pixel2cam() returns). See the cv2.triangulatePoints() documentation. You can also read the maths behind OpenCV's projection model in the 'Detailed Description' section of this page.
Here is the corrected version of your triangulate()
function:
def triangulate(points_1, points_2, K, R, t):
    # projection matrix of the first camera (world frame = camera 1 frame)
    T1 = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 0]], dtype=np.float32)
    # projection matrix of the second camera: [R|t] maps camera 1 frame to camera 2 frame
    T2 = np.hstack((R, t)).astype(np.float32)
    proj1 = np.matmul(K, T1)
    proj2 = np.matmul(K, T2)
    # triangulatePoints expects 2xN arrays of pixel coordinates
    pts_1 = np.array(points_1, dtype=np.float32).reshape(2, 1)
    pts_2 = np.array(points_2, dtype=np.float32).reshape(2, 1)
    X = cv2.triangulatePoints(proj1, proj2, pts_1, pts_2)
    X /= X[3]  # normalize the homogeneous coordinate
    return X
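With this version, the raw pixel coordinates from the question are passed directly, e.g. X = triangulate((320, 280), (160, 408), K, R, t) (note that R and t still need the corrections described below).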
In the OpenCV coordinate system, the z-axis of the camera is the optical axis (see the diagram in this answer). A rotation around the z-axis therefore doesn't match the image you linked in your question; a rotation of -50 degrees around the y-axis seems more correct.
You can recompute your rotation matrix using cv2.Rodrigues(). Be aware that this function takes angles in radians; see the documentation.
# -50 degrees about the y-axis, written as an axis-angle rotation vector in radians
rotation_vector = np.array([0, -50 / 180 * np.pi, 0])
R, _ = cv2.Rodrigues(rotation_vector)
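As a quick sanity check (a minimal sketch, assuming nothing beyond the standard right-handed convention), the result should match the textbook y-axis rotation matrix:

theta = -50 / 180 * np.pi
# for a rotation of theta about the y-axis, Rodrigues gives
# [[cos(theta), 0, sin(theta)], [0, 1, 0], [-sin(theta), 0, cos(theta)]]
expected = np.array([[np.cos(theta), 0, np.sin(theta)],
                     [0, 1, 0],
                     [-np.sin(theta), 0, np.cos(theta)]])
assert np.allclose(R, expected)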
You stated that your focal length is in pixel units. Looking at the data, I highly doubt that the focal length and translation values are consistent, at least if they are both expressed in pixels. I see two possibilities: either the focal length is in pixels while the translation is in a distance unit (e.g. meters), or the other way around.
To fix that, make sure that the focal length and translation are both expressed in pixels, or both in the same distance unit. You can convert from pixels to a distance unit by multiplying by the pixel pitch of your camera, and convert from a distance unit to pixels by dividing by the pixel pitch. The pixel pitch should be written in the datasheet of the camera model you are using.
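For illustration, here is a minimal conversion sketch; the 3 µm pixel pitch is a hypothetical value, read the real one from your camera's datasheet:

pixel_pitch = 3.0e-6  # meters per pixel (hypothetical value)
f_pixels = 270.0
f_meters = f_pixels * pixel_pitch           # pixels -> meters: multiply by the pixel pitch
t_meters = np.array([[3.5], [1.5], [1.0]])  # assuming the translation is in meters
t_pixels = t_meters / pixel_pitch           # meters -> pixels: divide by the pixel pitch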
Applying the three corrections described above gives me an X vector with a positive z coordinate. The value I obtain doesn't mean much though, since I am not sure of the correct values and units of the input data.
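For reference, a minimal end-to-end sketch combining the three corrections (this assumes, for the sake of the example, that the translation values are consistent with the pixel focal length):

import numpy as np
import cv2

def triangulate(points_1, points_2, K, R, t):
    T1 = np.hstack((np.eye(3, dtype=np.float32), np.zeros((3, 1), dtype=np.float32)))
    T2 = np.hstack((R, t)).astype(np.float32)
    proj1 = np.matmul(K, T1)  # projection matrix of camera 1 (reference frame)
    proj2 = np.matmul(K, T2)  # projection matrix of camera 2
    pts_1 = np.array(points_1, dtype=np.float32).reshape(2, 1)
    pts_2 = np.array(points_2, dtype=np.float32).reshape(2, 1)
    X = cv2.triangulatePoints(proj1, proj2, pts_1, pts_2)
    return X / X[3]

K = np.array([[270, 0, 320], [0, 270, 240], [0, 0, 1]], dtype=np.float32)
R, _ = cv2.Rodrigues(np.array([0, -50 / 180 * np.pi, 0]))  # -50 degrees about the y-axis
t = np.array([[3.5], [1.5], [1.0]], dtype=np.float32)
X = triangulate((320, 280), (160, 408), K, R, t)
print(X)  # the z component should now be positive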