Steve Osborne

Reputation: 680

Method to determine polygon surface rotation from top-down camera

I have a webcam looking down on a surface which rotates about a single axis. I'd like to be able to measure the rotation angle of the surface.

The camera position and the rotation axis of the surface are both fixed. The surface is a distinct solid color right now, but I do have the option to draw features on the surface if it would help.

Here's an animation of the surface moving through its full range, showing the different apparent shapes:

[animation of the surface moving through its full range]

My approach thus far:

  1. Record a series of "calibration" images, where the surface is at a known angle in each image
  2. Threshold each image to isolate the surface.
  3. Find the four corners with cv2.approxPolyDP(). I iterate through various epsilon values until I find one that yields exactly 4 points.
  4. Order the points consistently (top-left, top-right, bottom-right, bottom-left)
  5. Compute the angle of each edge between consecutive points with atan2.
  6. Use the angles as features to fit sklearn's linear_model.LinearRegression() (see the sketch below)
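In code, the pipeline looks roughly like this (Otsu thresholding, OpenCV 4's findContours signature, and the calib list of (image, known_angle) pairs are placeholders for my actual setup):

    import cv2
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def corner_angles(image_bgr):
        """Steps 2-5: threshold, find the 4 corners, order them, return edge angles."""
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        # Assumes the surface is the bright region; Otsu picks the split point.
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
        contour = max(contours, key=cv2.contourArea)

        # Sweep epsilon until approxPolyDP yields exactly 4 points.
        perimeter = cv2.arcLength(contour, True)
        for frac in np.linspace(0.005, 0.1, 50):
            approx = cv2.approxPolyDP(contour, frac * perimeter, True)
            if len(approx) == 4:
                break
        else:
            raise ValueError("no epsilon gave a 4-point polygon")

        pts = approx.reshape(4, 2).astype(float)
        # Order TL, TR, BR, BL via the usual sum/difference trick.
        s, d = pts.sum(axis=1), pts[:, 1] - pts[:, 0]
        pts = np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                        pts[np.argmax(s)], pts[np.argmax(d)]])

        edges = np.roll(pts, -1, axis=0) - pts  # vectors between consecutive corners
        return np.arctan2(edges[:, 1], edges[:, 0])

    # calib (hypothetical): list of (image, known_angle) calibration pairs.
    X = np.array([corner_angles(img) for img, _ in calib])
    y = np.array([ang for _, ang in calib])
    model = LinearRegression().fit(X, y)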

This approach is getting me predictions within about 10% of the actual angle with only 3 training images (covering the full positive, full negative, and middle positions). I'm pretty new to both OpenCV and sklearn; is there anything I should consider doing differently to improve the accuracy of my predictions? (Probably increasing the number of training images is a big one?)

I did experiment with cv2.moments directly as my model features, and then some values derived from the moments, but these did not perform as well as the angles. I also tried using a RidgeCV model, but it seemed to perform about the same as the linear model.

Upvotes: 7

Views: 527

Answers (3)

Christian

Reputation: 1257

Another option that is rather easy to implement, especially since you've already done part of the job, is the following (I've used it to compute the orientation of a cylindrical part from 3 images acquired while the tube was rotating):

  1. Threshold each image to isolate the surface.
  2. Find the four corners with cv2.approxPolyDP(), alternatively you could find the four sides of your part with LineSegmentDetector (available from OpenCV 3).
  3. Compute the angle alpha, as depicted in the image below

When your part is rotating, this angle alpha will follow a sine curve. That is, you will measure alpha(theta) = A sin(theta + B) + C. Given alpha you want to know theta, but first you need to determine A, B and C.

  1. You've acquired a number of "calibration" or reference images; you can use all of these to fit a sine curve and determine A, B and C.
  2. Once this is done, you can determine theta from alpha.

Notice that you have to deal with the ambiguity sin(Pi - a) = sin(a): two values of theta in each period produce the same alpha. This is not a problem if you acquire more than one image sequentially; if you have a single static image, you need an extra mechanism to disambiguate.
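Roughly, the fit and the inversion could look like this (thetas_cal / alphas_cal stand in for your calibration data; scipy's curve_fit does the sine fit):

    import numpy as np
    from scipy.optimize import curve_fit

    def alpha_model(theta, A, B, C):
        return A * np.sin(theta + B) + C

    # thetas_cal / alphas_cal (hypothetical): the known surface angle (radians)
    # and the measured alpha for each calibration image.
    (A, B, C), _ = curve_fit(alpha_model, thetas_cal, alphas_cal, p0=(1.0, 0.0, 0.0))

    def theta_candidates(alpha):
        """Invert alpha = A*sin(theta + B) + C. arcsin is two-valued, so both
        candidates are returned; pick the branch from context (e.g. the
        previous frame's estimate, or the mechanical range of the surface)."""
        s = np.clip((alpha - C) / A, -1.0, 1.0)
        return np.arcsin(s) - B, np.pi - np.arcsin(s) - B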

Hope I'm clear enough; the implementation really shouldn't be a problem given what you have done already.

Upvotes: 1

Gopiraj

Reputation: 657

If I understand correctly, you want to estimate the rotation of the polygon with respect to the camera. If you know the dimensions of the object in 3D, you can use solvePnP to estimate the pose of the object, from which you can get its rotation.

Steps:

  1. Calibrate your webcam and get the intrinsic matrix and distortion coefficients.

  2. Get the 3D measurements of the object corners and find the corresponding points in 2D. Assuming a rectangular planar object, the corners in 3D would be (0,0,0), (0, 100, 0), (100, 100, 0), (100, 0, 0).

  3. Use solvePnP to get the rotation and translation of the object

The rotation will be the rotation of your object about its axis. Here you can find an example of estimating the pose of a head; you can modify it to suit your application.
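A rough sketch of steps 2-3 (image_pts, camera_matrix and dist_coeffs are placeholders from steps 1-2; the 100x100 target size is just the example above):

    import cv2
    import numpy as np

    # Step 2: the rectangle's corners in its own 3D frame (units arbitrary, e.g. mm).
    object_pts = np.array([[0, 0, 0], [0, 100, 0], [100, 100, 0], [100, 0, 0]],
                          dtype=np.float32)
    # image_pts (hypothetical): the same four corners detected in the image, in
    # the same order. camera_matrix / dist_coeffs come from cv2.calibrateCamera.
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, camera_matrix, dist_coeffs)

    # rvec is a Rodrigues rotation vector; convert it to a 3x3 matrix and pull
    # out the rotation about the axis you care about (one common Euler convention):
    R, _ = cv2.Rodrigues(rvec)
    angle_deg = np.degrees(np.arctan2(R[2, 1], R[2, 2]))  # rotation about x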

Upvotes: 4

one_observation

Reputation: 464

Your first step is good -- everything after that becomes way more complicated than necessary (if I understand correctly).

Don't think of it as 'learning'; just think of it as a reference. Every time you're in a particular position where you DON'T know the angle, take a picture and find the reference picture that looks most like it. Guess it's THAT angle. You're done! (There may well be indeterminacies -- maybe the relationship isn't bijective -- but that's where I'd start.)

You can consider this a 'nearest-neighbor classifier' if you want, but that's just to make it sound better. Measure a simple distance (Euclidean! Why not!) between the uncertain picture and all the reference pictures -- meaning, between the raw image vectors, nothing fancy -- and choose the angle that corresponds to the minimum distance between the observed and the known.

If this isn't working -- and maybe do this anyway -- stop throwing away so much information! You're stripping things down, then trying to re-estimate them, propagating error all over the place for no obvious (to me) benefit. So when you do the nearest-neighbor lookup against the reference pictures, why not just use the full picture? (Maybe other elements in it will change? That's a more complicated question, but basically: throw away as little as possible; it should all be useful later in accurately choosing your 'nearest neighbor'.)
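The whole thing is a few lines (refs stands in for your list of (reference_image, known_angle) pairs):

    import numpy as np

    def nearest_angle(query, refs):
        """Return the known angle of the reference image nearest to `query`,
        using plain Euclidean distance on the raw pixel vectors.
        `refs` (hypothetical) is a list of (reference_image, known_angle) pairs;
        all images must share the same shape."""
        q = query.astype(float).ravel()
        dists = [np.linalg.norm(q - ref.astype(float).ravel()) for ref, _ in refs]
        return refs[int(np.argmin(dists))][1]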

Upvotes: 1
