Reputation: 133
I'm currently working on a project in which I have to estimate 3D coordinates of 2D interest points detected using a monocular camera.
To be more precise, I have as input a calibrated image sequence, and when a new image arrives I need to triangulate points between the previous ("left") image and the current ("right") one to get 3D points.
To do this, I'm following these steps:
The resulting 3D points are not correct when I reproject them onto the images. However, I have read that triangulated points are defined only up to an indeterminate scale factor.
So my question is: What does "up to scale" mean in this context? And how can I get the real 3D points in the scene's world coordinate frame?
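To illustrate the ambiguity being asked about, here is a small sketch (synthetic numbers, hypothetical intrinsics, not the asker's actual setup): scaling the whole scene and the camera translation by the same factor produces exactly the same images, so no monocular method can tell the two scenes apart.

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])       # hypothetical intrinsics
R = np.eye(3)                         # camera rotation
t = np.array([[0.2], [0.0], [0.0]])   # camera translation (the baseline)

def project(K, R, t, X):
    """Project a 3D point X (3x1) with the camera P = K [R | t]."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

X = np.array([[1.0], [0.5], [4.0]])   # one scene point

s = 3.7                               # arbitrary scale factor
px = project(K, R, t, X)              # original scene
px_scaled = project(K, R, s * t, s * X)  # scene AND baseline scaled by s

print(np.allclose(px, px_scaled))     # True: the two images are identical
```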
I would be thankful for any help!
Upvotes: 6
Views: 3387
Reputation: 11785
You are likely to have a bug or a poorly estimated essential matrix. The unknown scale factor cannot be responsible for the reconstruction errors you see. Regardless of the global scale, projecting onto the image pair a 3D point estimated from good matches and a valid essential matrix should give consistent results.
The meaning of "up to scale" in this context is that, even with intrinsically calibrated cameras, the standard method of estimating the essential matrix yields the same result if you replace your scene with one in which everything is larger or smaller by the same amount. You can remove this ambiguity either by knowing the true distance between two reconstructed points in the scene, or by knowing the actual length of the camera baseline (the translation between the two views).
Upvotes: 3
Reputation: 670
My experience working with Structure from Motion is that extracting a point cloud via the fundamental matrix / essential matrix leads to a point cloud of arbitrary size. As I understand it, this is because the fundamental/essential matrix is valid if
x1^t * F * x2 = 0.
This equation still holds if F is multiplied by any non-zero scale factor, so the absolute scale cannot be recovered.
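A quick numerical check of this (with a made-up camera and motion, not data from the question): the epipolar residual stays at zero no matter how F is rescaled.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                      # made-up relative motion
t = np.array([0.5, 0.0, 0.0])

E = skew(t) @ R                    # essential matrix for this motion
Kinv = np.linalg.inv(K)
F = (Kinv.T @ E @ Kinv).T          # oriented so that x1^T * F * x2 = 0

# Project one 3D point into both views (homogeneous pixel coordinates)
X = np.array([1.0, 0.5, 4.0])
x1 = K @ X                         # first camera at the origin
x2 = K @ (R @ X + t)               # second camera

for s in (1.0, 0.01, 100.0):
    print(abs(x1 @ (s * F) @ x2))  # ~0 for every non-zero scale s
```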
What worked really well for me is to extract the new camera pose from the existing point cloud that was computed from earlier image pairs. For that you have to remember the 2D-3D correspondences from the previous images. This is called Perspective-n-Point camera pose estimation (PnP), and OpenCV provides methods for it (e.g. solvePnP, solvePnPRansac).
Here is a related write-up on structure from motion:
Structure from Motion, Reconstruct the 3D Point Cloud given 2D Image points correspondence
Upvotes: 0