PsP

Reputation: 696

pixel correspondence in optical flow

I am somewhat new to the concept of optical flow in video sequences. I've read the basics about optical flow and I'm familiar with the Horn & Schunck and Lucas & Kanade methods.

I realized that in these methods we calculate vectors which represent the movements of pixels between frames, subject to certain constraints on those pixels (brightness constancy, smoothness, and so on).

My question:

According to the formula fx*u + fy*v = -ft, how exactly do we establish a correspondence between one pixel in frame t and another pixel in frame t + 1?

I mean, how can we be sure that the pixel we found in frame t + 1 is the same pixel from frame t? I don't see in which part of these algorithms we find those pixels and establish the correspondence between frame t and frame t + 1. I know that we can find the pixels which have moved, but I don't understand how the relation between the pixels in frame t and frame t + 1 is found.

I hope that you understand my question :o)(o:

If possible, please answer as formally as you can.

Thank you very much

Upvotes: 2

Views: 2064

Answers (3)

koshy george

Reputation: 680

In Horn and Schunck's method there is no need to compute pixel correspondences across two frames by any extraneous method. H&S is an iterative algorithm: for two consecutive frames, you start with some initial values for the u-s and v-s and iterate until it converges.

detail:

For two consecutive frames, you perform several iterations of the following update. (It is computed for every pixel; imagine having a u-image-buffer and a v-image-buffer.)

u = u_av - Fx *(P/D)
v = v_av - Fy *(P/D)

where

*, stands for multiplication
P = Fx * u_av + Fy  * v_av + Ft
D = lambda + Fx**2 + Fy**2
Fx = gradient of image along x (can be averaged across the two frames)
Fy = gradient of image along y (can be averaged across the two frames)
Ft = temporal gradient across two frames
u_av = (sum of u-s of 4 diagonal neighbors)/4
v_av = (sum of v-s of 4 diagonal neighbors)/4
lambda=smoothness constraint coefficient

The initial values of u and v can be zeros.
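The iteration above can be sketched in numpy. This is a minimal illustration, not the exact scheme from the paper: the gradient operators, the wrap-around border handling, and the defaults for lambda and the iteration count are all assumptions of this sketch.

```python
import numpy as np

def horn_schunck(frame1, frame2, lam=1.0, n_iter=100):
    """Sketch of the Horn-Schunck update described above.

    lam is the smoothness coefficient (lambda in the text); the gradient
    operators and border handling here are illustrative choices.
    """
    f1 = frame1.astype(np.float64)
    f2 = frame2.astype(np.float64)

    # Spatial gradients averaged across the two frames, temporal gradient.
    Fx = 0.5 * (np.gradient(f1, axis=1) + np.gradient(f2, axis=1))
    Fy = 0.5 * (np.gradient(f1, axis=0) + np.gradient(f2, axis=0))
    Ft = f2 - f1

    # u- and v-image-buffers, initialized to zero as the answer suggests.
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)

    def diag_avg(a):
        # Average of the 4 diagonal neighbors (wrap-around at borders).
        return 0.25 * (np.roll(a, (1, 1), axis=(0, 1)) +
                       np.roll(a, (1, -1), axis=(0, 1)) +
                       np.roll(a, (-1, 1), axis=(0, 1)) +
                       np.roll(a, (-1, -1), axis=(0, 1)))

    D = lam + Fx**2 + Fy**2
    for _ in range(n_iter):
        u_av = diag_avg(u)
        v_av = diag_avg(v)
        P = Fx * u_av + Fy * v_av + Ft
        u = u_av - Fx * (P / D)
        v = v_av - Fy * (P / D)
    return u, v
```

For a bright square shifted one pixel to the right between two frames, the returned u-buffer is positive on average, i.e. the field points in the direction of motion.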

Upvotes: 0

Tobias Senst

Reputation: 2830

Actually the methods of Horn & Schunck and Lucas & Kanade deal in different ways with the equation:

Fx*U + Fy*V = -Ft

As you see, this equation is an underdetermined system. So Horn and Schunck proposed to add a second assumption, the smoothness constraint: the gradients of U and V should be small. This is integrated into a least-squares framework where you have:

(Fx*U + Fy*V + Ft)² + lambda * (gradient(U)² + gradient(V)²) = E
E -> min

With that equation it is possible to solve for U and V by setting the derivative of E to 0. Consequently the solutions for the motion vectors are coupled via the gradient operator of U and V.

Lucas and Kanade proposed instead that in a defined region, the Lucas-Kanade window, only one motion vector is computed (i.e. the region obeys a motion-constancy constraint), and put this into a least-squares framework:

sum(Fx*U + Fy*V + Ft)² = E
E->min

The summation is done over each pixel in the defined region, and U and V can then easily be computed by setting the derivative of E to 0.
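Setting that derivative to zero gives a 2x2 linear system (the normal equations) per window, which can be solved directly. Here is a minimal numpy sketch for a single pixel; the window size, the gradient operator, and the degeneracy threshold are assumptions of this sketch, not part of the original formulation.

```python
import numpy as np

def lucas_kanade_at(f1, f2, y, x, half=2):
    """Solve the per-window least-squares system sum(Fx*U + Fy*V + Ft)^2 -> min.

    Returns (U, V) for the window centered at (y, x), or None if the
    system is degenerate (aperture problem: no texture in the window).
    """
    Fx = np.gradient(f1.astype(np.float64), axis=1)
    Fy = np.gradient(f1.astype(np.float64), axis=0)
    Ft = f2.astype(np.float64) - f1.astype(np.float64)

    # Gather gradients over the (2*half+1)^2 window around (y, x).
    win = np.s_[y - half:y + half + 1, x - half:x + half + 1]
    fx, fy, ft = Fx[win].ravel(), Fy[win].ravel(), Ft[win].ravel()

    # Normal equations from dE/dU = 0, dE/dV = 0:
    # [sum fx*fx  sum fx*fy] [U]   [-sum fx*ft]
    # [sum fx*fy  sum fy*fy] [V] = [-sum fy*ft]
    A = np.array([[np.sum(fx * fx), np.sum(fx * fy)],
                  [np.sum(fx * fy), np.sum(fy * fy)]])
    b = -np.array([np.sum(fx * ft), np.sum(fy * ft)])
    if np.linalg.det(A) < 1e-9:
        return None  # untextured window, motion not recoverable
    U, V = np.linalg.solve(A, b)
    return U, V
```

On a linear image shifted one pixel to the right, the solve recovers (U, V) = (1, 0) exactly, since the first-order Taylor approximation is exact for a linear signal.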

With these two equations you see that the pixel correspondences are found by using the temporal (Ft) and spatial (Fx, Fy) image gradients. There is a nice picture in the original Lucas and Kanade paper that shows this correlation graphically. However, there are some points to consider:

  • These kinds of methods can only compute motion vectors where the image contains texture (aperture problem).
  • Fx*U + Fy*V + Ft is a first-order Taylor approximation of F(x, y, t) = F(x + U, y + V, t + 1). That means your image signal needs to be locally linear; in consequence you can only compute motions of up to a few pixels. That's why image pyramids are used, to keep the linearization valid for larger motions.
  • Motion-constancy or smoothness constraints prevent sharp motion boundaries. This can be important in some applications.
  • The framework does not protect you from the classical correspondence problem.

Upvotes: 6

rotating_image
rotating_image

Reputation: 3086

With Fx*U + Fy*V = -Ft we cannot solve the equation for a single pixel. So after cvGoodFeaturesToTrack gives you a set of pixels, a window is chosen around each pixel in that set. According to the constant-intensity assumption, that patch/window (centered on the chosen pixel) is supposed to have the same intensity in the next frame.

Suppose in frameA we find the U and V for a point by considering a window around it. U and V give the displacement, in pixels, that the point is supposed to undergo in the horizontal and vertical directions. Using U and V we find the position of the point in the next frame, i.e. frameB. According to the constant-intensity assumption, the patch around the predicted point in frameB should have the same intensity as the patch around the point in frameA. By comparing the intensities of the two patches in frameA and frameB, it is determined whether the point has been tracked well or not. I have tried to explain as much as I could; correct me if I am wrong at some point.

Upvotes: 1
