Raziel

Reputation: 474

How to interpret Disparity value

Assume we have two rectified photos from stereo cameras, with known pixel positions, and we want to draw the disparity map.

Which pixel is the closest if a pixel in the right photo can move in either direction? I know that the farthest point is the one with the minimum value of q.x - p.x (where p is a pixel in the left photo and q is the matching pixel in the right photo), so is the maximum of this value the closest point?

Thank you

Upvotes: 3

Views: 4275

Answers (1)

BHawk

Reputation: 2472

Disparity maps are usually written with signed values which indicate which direction the pixel moves from one image to the other in the stereo pair. For instance if you have a pixel in the left view at location <100,250> and in the right view the corresponding pixel is at location <115,250> then the disparity map for the left view at location <100,250> would have a value of 15. The disparity map for the right view at location <115,250> would have a value of -15.
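The sign convention above can be sketched in a few lines of Python; the coordinates are the hypothetical ones from the example, not part of any standard API:

```python
# Hypothetical matched pixel coordinates for one correspondence,
# (x, y) in the left and right rectified images.
p_left = (100, 250)   # pixel location in the left view
p_right = (115, 250)  # corresponding pixel in the right view

# Disparity stored in the LEFT view's map at p_left:
# how far the pixel moves going left -> right.
disparity_left = p_right[0] - p_left[0]    # 115 - 100 = 15

# Disparity stored in the RIGHT view's map at p_right:
# the same motion seen right -> left, so the sign flips.
disparity_right = p_left[0] - p_right[0]   # 100 - 115 = -15

print(disparity_left, disparity_right)  # 15 -15
```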

Disparity maps can be multi-channel images usually with the x-shift in the first channel and the y-shift in the second channel. If you are looking at high resolution stereo pairs with lots of disparity you might not be able to fit all possible disparity values into an 8-bit image. In the film industry most disparity maps are stored as 16 or 32 bit floating point images.
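A minimal sketch of that layout with NumPy follows; the image size and the choice of a two-channel `float32` array are assumptions for illustration, not a file-format standard:

```python
import numpy as np

h, w = 480, 640  # hypothetical image size

# Two-channel 32-bit float disparity map: channel 0 holds the x-shift,
# channel 1 the y-shift (near zero for a well-rectified pair).
disparity = np.zeros((h, w, 2), dtype=np.float32)
disparity[..., 0] = 15.25  # sub-pixel x-disparity; float32 stores it exactly,
                           # an 8-bit integer container could not
```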

There is no standard method of scaling disparity, and scaling is generally frowned upon because disparity is meant to describe a physical, measurable property of the scene. Sometimes, however, it is necessary. For instance, if you want to record the disparity of a large stereo pair in an 8-bit image, you will have to scale the values to fit into the 8-bit container. You can do this in many different ways.

One way to scale a disparity map is to take the largest absolute disparity value and divide all values by a factor that brings it within the limits of your signed 8-bit container (-128 to 127). This method is easy to scale back to the original disparity range using a simple multiplier, but it obviously loses detail because of the coarser steps created by the division. For example, suppose an image has a disparity range of -200 to 50, i.e. 251 possible integer disparity values. Dividing all values by 200/128 = 1.5625 gives a range of -128 to 32, or 161 possible values. When I scale those values back up with a multiply I get -200 to 50 again, but now there are only 161 distinct disparity values within that range.
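The divide-and-round scheme can be sketched as follows; the disparity values are synthetic, chosen only to reproduce the range in the example:

```python
import numpy as np

# Hypothetical integer disparities covering the example range -200..50.
disp = np.arange(-200, 51)          # 251 distinct integer values

factor = 200 / 128                  # 1.5625: maps the largest magnitude to -128
scaled = np.round(disp / factor).astype(np.int8)   # now fits in signed 8-bit

restored = scaled.astype(np.float64) * factor
# The round trip is lossy: several inputs collapse onto each quantized step,
# so only 161 distinct values survive out of the original 251.
print(len(np.unique(scaled)))
```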

Another method, using the same disparity range, is to simply shift the range. The total range spans 251 values and our signed 8-bit container can hold 256, so we add 200 - 128 = 72 to all values, which gives a new range of -128 to 122. This keeps every disparity step, and the exact input image is recovered simply by subtracting the shift factor back out.
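The shift method is a one-liner in each direction; again the data is synthetic, matching the example range:

```python
import numpy as np

# Hypothetical integer disparities covering the example range -200..50.
disp = np.arange(-200, 51)

shift = 200 - 128              # 72: moves -200 up to the int8 minimum, -128
stored = (disp + shift).astype(np.int8)   # new range -128..122, nothing lost

# Exact recovery: subtract the shift back out.
restored = stored.astype(np.int64) - shift
print(np.array_equal(restored, disp))  # True: the round trip is lossless
```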

Conversely, if you have a disparity map with a range of -5 to 10, you might want to expand that range to capture sub-pixel disparity values. For instance, you might scale 10 up to 127 and -5 down to about -64. This gives finer quantization, but the scale factor, and with it the number of possible values, changes from frame to frame depending on the input disparity range.
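A sketch of that expansion, assuming a small set of made-up sub-pixel disparities in the -5 to 10 range:

```python
import numpy as np

# Hypothetical float disparities in the range -5..10, with sub-pixel detail.
disp = np.array([-5.0, -2.25, 0.0, 3.5, 10.0], dtype=np.float32)

factor = 127 / 10              # 12.7: maps the maximum, 10, to the int8 ceiling
stored = np.round(disp * factor).astype(np.int8)

restored = stored.astype(np.float32) / factor
# Sub-pixel detail survives to within half a quantization step (~0.04 px),
# but the factor must be saved alongside the image to undo the scaling.
print(np.max(np.abs(restored - disp)))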

The problem with scaling methods is that they can be lossy and each saved image will have a scaling factor/method that needs to be reversed. If each image has a separate scaling factor, then that factor has to be stored with the image. If each image has the same scaling factor then there will be a larger degradation of the data due to the reduction of possible values. This is why it is generally good practice to store disparity maps at higher bit-depths to ensure the integrity of the data.

Upvotes: 6
