Reputation: 2360
I'm currently trying to solve the RGBD SLAM problem, but am experiencing some issues estimating poses via RANSAC. I have (I believe correctly) transformed the points from 2D to 3D via:
def transform3d(x, y, depth):
    # Back-project a pixel to 3D with the pinhole model; scalingFactor,
    # focalX, focalY, centerX and centerY are the (global) camera intrinsics.
    Z = depth[x][y] / scalingFactor
    X = (x - centerX) * Z / focalX
    Y = (y - centerY) * Z / focalY
    return (X, Y, Z)

def transform(matches, depth1, depth2, kp1, kp2):
    points_3d, points_2d = [], []
    for mat in matches:
        img1_idx = mat.queryIdx
        img2_idx = mat.trainIdx
        (y1, x1) = kp1[img1_idx].pt
        (y2, x2) = kp2[img2_idx].pt
        if depth1[x1][y1] == 0:  # skip matches without a depth reading
            continue
        points_2d.append(kp2[img2_idx].pt)
        points_3d.append(np.array(transform3d(x1, y1, depth1)))
    return (np.array(points_3d, np.float32), np.array(points_2d, np.float32))
Afterwards I call the calibrateCamera function to retrieve the distortion parameters:
mtx = np.array([[focalX, 0, centerX], [0, focalY, centerY], [0, 0, 1]], np.float32)
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    np.array([points_3d]), np.array([points_2d]), rgb1.shape[::-1],
    mtx, None, flags=cv2.CALIB_USE_INTRINSIC_GUESS)
and ran RANSAC to obtain the rotation and translation vectors:

ret, rvecs, tvecs, inliers = cv2.solvePnPRansac(np.array([points_3d]), np.array([points_2d]), mtx, dist)
For the above I went through OpenCV's tutorial on pose estimation. I have also followed this article http://ksimek.github.io/2012/08/22/extrinsic/ and tried to express the pose (the camera centre, C = -R^T * t) by doing
R = cv2.Rodrigues(rvecs)[0].T
pose = -R.dot(tvecs)  # camera centre C = -R^T * t (matrix product, not elementwise)
My poses are definitely wrong, yet I have no idea where the issue lies. I have also cross-checked my code against this C++ implementation of RGBD SLAM: http://www.cnblogs.com/gaoxiang12/p/4659805.html
Please help! I really want to get my robot moving :)
Upvotes: 3
Views: 3212
Reputation: 1
I think you should check the order of (x, y). You have mixed up the (x, y) that the keypoint returns with the (row, column) index used to access values in the depth images.
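As a minimal sketch of the two conventions (the helper function below is hypothetical, not taken from your code): cv2.KeyPoint.pt is (x, y), i.e. (column, row), while a NumPy image array is indexed as [row, column]:

import numpy as np

def depth_at_keypoint(depth, kp):
    # KeyPoint.pt returns (x, y) = (column, row) in image coordinates.
    x, y = kp.pt
    u, v = int(round(x)), int(round(y))
    # NumPy arrays are indexed [row, column], i.e. [y, x].
    return depth[v, u]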
Upvotes: 0
Reputation: 86
First, you should probably avoid calling calibrateCamera at every step. Calibration should be done only once, from a calibration pattern such as a chessboard, and independently of your main program: you calibrate your camera once and stick with those parameters for as long as you trust them. You can find existing programs that estimate them for you.

If you want to get started quickly, you can plug in a theoretical focal length (an approximate value for that type of camera, given by the manufacturer) and assume a perfect camera with the principal point (cx, cy) at the centre of the image. This will give you a rough estimate of the poses, not a completely wrong one, and you can refine it later with better calibration values.
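For illustration, here is a rough sketch of both options (the chessboard pattern size, file paths and focal value are my assumptions, loosely following OpenCV's calibration tutorial):

import glob
import numpy as np
import cv2

# Option 1: one-off calibration from chessboard images (9x6 inner corners assumed).
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)
obj_points, img_points = [], []
for fname in glob.glob('calib/*.png'):  # hypothetical image folder
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
ret, mtx, dist, _, _ = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
np.savez('intrinsics.npz', mtx=mtx, dist=dist)  # reuse these in the SLAM code

# Option 2: rough theoretical intrinsics: manufacturer focal length,
# principal point at the image centre, zero distortion.
h, w = 480, 640  # assumed image size
focal = 525.0    # example value; check your camera's datasheet
mtx = np.array([[focal, 0.0, w / 2.0],
                [0.0, focal, h / 2.0],
                [0.0, 0.0, 1.0]], np.float32)
dist = np.zeros(5, np.float32)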
For the rest of the code, there might be an error here:
points_2d.append(kp2[img2_idx].pt)
points_3d.append(np.array(transform3d(x1, y1, depth1)))
It seems you mix 2D points from set 2 with 3D points from set 1, which does not look consistent.
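Whatever pairing you choose, each points_3d[i] must be the 3D location of the same physical feature that points_2d[i] observes. As a minimal sketch of the shapes solvePnPRansac expects (the reshape calls are my assumption, using the names from your code):

points_3d = np.asarray(points_3d, np.float32).reshape(-1, 3)  # N x 3 object points
points_2d = np.asarray(points_2d, np.float32).reshape(-1, 2)  # N x 2 image points
ret, rvecs, tvecs, inliers = cv2.solvePnPRansac(points_3d, points_2d, mtx, dist)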
Hope it helps.
Upvotes: 2