motion reconstruction from a single camera

Question

I have a single calibrated camera (known intrinsic parameters, i.e. camera matrix K is known, as well as the distortion coefficients).

I would like to reconstruct the camera's 3d trajectory. There is no a-priori knowledge about the scene.

To simplify the problem, consider two images of the same scene, from which I have extracted two sets of matched feature points (SIFT, SURF, ORB, etc.). How can I calculate the camera's extrinsic parameters (i.e. the rotation matrix R and the translation vector t) between the two viewpoints?

I have managed to calculate the fundamental matrix and, since K is known, the essential matrix as well. Using David Nister's efficient solution to the Five-Point Relative Pose Problem I've managed to get 4 possible solutions, but:

1. The constraint on the essential matrix, E ~ U * diag(s, s, 0) * V', doesn't always hold, which causes incorrect results. [EDIT]: replacing the two non-zero singular values with their average seems to correct the results :) One down.
2. How can I tell which one of the four solutions is the correct one?

Thanks

beaker · Accepted Answer

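Your solution to point 1 is correct: replace the singular values of the estimated E with diag( (s1 + s2)/2, (s1 + s2)/2, 0), where s1 and s2 are its two largest singular values.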
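A minimal NumPy sketch of that correction, in case it helps. The function name `project_to_essential` and the random test matrix are just placeholders for illustration, not anything from the linked code:

```python
import numpy as np

def project_to_essential(E):
    """Force the singular values of E to the form (s, s, 0)."""
    U, S, Vt = np.linalg.svd(E)       # S is sorted in descending order
    s = (S[0] + S[1]) / 2.0           # average of the two largest singular values
    return U @ np.diag([s, s, 0.0]) @ Vt

# Stand-in for a noisy essential matrix estimate (hypothetical data).
rng = np.random.default_rng(0)
E_noisy = rng.standard_normal((3, 3))

E_fixed = project_to_essential(E_noisy)
singular_values = np.linalg.svd(E_fixed, compute_uv=False)
```

After the call, the first two entries of `singular_values` are equal and the third is zero up to floating-point error.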

As for telling which one of the four solutions is correct, only one will place the triangulated points at positive depth in both camera frames (i.e. in front of both cameras). That's the one you want.
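Here is a rough sketch of that positive-depth test in NumPy. It assumes the convention x2 ~ R X + t with E = [t]x R, points given in normalized homogeneous coordinates (pixel coordinates multiplied by inv(K)), and a simple linear triangulation. All function names and the synthetic data are my own for illustration; this is not the linked MATLAB code. With noisy matches not every point will pass, so the sketch keeps the candidate with the most points in front of both cameras.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x."""
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

def decompose_essential(E):
    """Return the four (R, t) candidates encoded by E."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so that both candidate rotations have determinant +1.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]                        # unit length; the true scale is unknown
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def triangulate(R, t, x1, x2):
    """Linear triangulation of one correspondence, expressed in camera 1's frame."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]        # null vector of A (homogeneous point)
    return X[:3] / X[3]

def select_pose(E, pts1, pts2):
    """Pick the candidate with the most points in front of both cameras."""
    best, best_count = None, -1
    for R, t in decompose_essential(E):
        count = 0
        for x1, x2 in zip(pts1, pts2):
            X = triangulate(R, t, x1, x2)
            if X[2] > 0 and (R @ X + t)[2] > 0:   # depth in camera 1 and camera 2
                count += 1
        if count > best_count:
            best, best_count = (R, t), count
    return best

# Synthetic, noise-free check (all values below are made up for the example).
a = 0.1
R_true = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
t_true = np.array([1.0, 0.0, 0.2])
t_true /= np.linalg.norm(t_true)
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 6.0])
X2 = X @ R_true.T + t_true
pts1, pts2 = X / X[:, 2:3], X2 / X2[:, 2:3]

R_est, t_est = select_pose(skew(t_true) @ R_true, pts1, pts2)
```

On this synthetic data `R_est` and `t_est` match `R_true` and the unit-length `t_true`; the other three candidates put every point behind at least one camera.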

Code for checking which solution is correct can be found here: http://cs.gmu.edu/%7Ekosecka/examples-code/essentialDiscrete.m (from http://cs.gmu.edu/%7Ekosecka/bookcode.html). They use the determinants of U and V to determine the solution with the correct orientation; look for the comment "then four possibilities are". Keep in mind that the essential matrix is only an estimate: it's susceptible to noise, and the estimation does not behave well at all if all of the points are coplanar.

Also, the translation is only recovered to within a constant scaling factor, so the fact that you're seeing a normalized translation vector of unit magnitude is exactly correct. The reason is that the depth is unknown and estimated to be 1. You'll have to find some way to recover the depth, as in the code for the eight-point algorithm + 3D reconstruction (Algorithm 5.1 in the bookcode link).
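To see why the scale cannot be recovered from the images alone, here is a small NumPy illustration (the pose, the points, and the factor 3.7 are arbitrary made-up values): scaling the scene and the translation by the same factor leaves the projected points unchanged.

```python
import numpy as np

def project(R, t, X):
    """Project 3D points X (N x 3) into normalized image coordinates."""
    Xc = X @ R.T + t                   # points in the camera frame
    return Xc[:, :2] / Xc[:, 2:3]

# Arbitrary pose and scene, chosen only for the demonstration.
a = 0.1
R = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (10, 3)) + np.array([0.0, 0.0, 6.0])

scale = 3.7
same_images = (
    np.allclose(project(np.eye(3), np.zeros(3), X),
                project(np.eye(3), np.zeros(3), scale * X))      # first view
    and np.allclose(project(R, t, X), project(R, scale * t, scale * X))  # second view
)
```

`same_images` comes out `True`, so any metric scale has to come from extra information, such as a known distance in the scene or a known baseline.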

The book from which the sample code above is taken is also a very good reference: http://vision.ucla.edu/MASKS/ Chapter 5, the one you're interested in, is available via the Sample Chapters link.
