OlorinIstari

Reputation: 587

Find relative scale in monocular Visual Odometry without PnP

I am implementing a standard VO pipeline with one change: extract features, match the feature points, find the essential matrix, and decompose it to get the pose. After initialization, however, instead of using 3D-2D motion estimation (PnP) for subsequent frames, I keep using the same 2D-2D motion estimation (via the essential matrix), which I find noticeably more accurate than 3D-2D. To recover the relative scale of the second pose with respect to the first, I can find the common points (those that were triangulated in both frame pairs). According to the Visual Odometry Tutorial by Scaramuzza and Fraundorfer, one can find the relative scale as the ratio of relative distances between pairs of common points.
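For reference, the quantity being estimated is (in my own notation, with X_{k,i} denoting common point i triangulated from frame pair k):

r = \frac{\lVert X_{k-1,i} - X_{k-1,j} \rVert}{\lVert X_{k,i} - X_{k,j} \rVert}

i.e. the distance between any two common points should change by the same factor r between the two triangulations, so each sampled pair gives one estimate of r.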

If f13D and f23D are the triangulated 3D points from subsequent frame pairs, I choose point pairs at random and compute the distance ratios; here is a rough code snippet for the same.

import numpy as np

def relative_scale(f13D, f23D):
    # Sample random point-pair indices, then drop degenerate pairs (i == j)
    indices = np.random.choice(len(f23D), size=(5 * len(f23D), 2), replace=True)
    indices = indices[indices[:, 0] != indices[:, 1]]
    # Ratio of distances between the same point pair in each triangulation
    num = np.linalg.norm(f13D[indices[:, 0]] - f13D[indices[:, 1]], axis=1)
    den = np.linalg.norm(f23D[indices[:, 0]] - f23D[indices[:, 1]], axis=1)
    return np.median(num / den)

I have also tried replacing the last line with a linear RANSAC estimator. However, since the triangulation is not perfect, these distance ratios are extremely noisy, and the scale estimate therefore varies significantly across different NumPy seeds.
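One deterministic variant I have considered (a rough sketch, not from the tutorial; it assumes f13D and f23D are (N, 3) NumPy arrays holding the common points in the same row order, and it adds a SciPy dependency) is to use every pairwise distance exactly once instead of random sampling, which at least removes the seed dependence:

import numpy as np
from scipy.spatial.distance import pdist

def relative_scale_pdist(f13D, f23D, eps=1e-9):
    # Condensed upper-triangle pairwise distances; the two arrays
    # line up pair-for-pair because the point ordering is shared.
    d1 = pdist(f13D)
    d2 = pdist(f23D)
    valid = d2 > eps  # skip near-coincident pairs to avoid blow-ups
    return np.median(d1[valid] / d2[valid])

This is O(N^2) in the number of common points, which is fine for a few hundred features, but it still inherits the noise of the individual triangulations.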

Is this the right way to implement relative scale in monocular VO as described in the tutorial? If not, what is the best way to do it? (I do not wish to use PnP, since its rotation estimate seems less accurate.)

Upvotes: 1

Views: 777

Answers (0)
