Reputation: 378
I am asking this question as a trimmed version of my previous question. I have a face looking at some position on the screen, and I also have the gaze coordinates (pitch and yaw) of both eyes. Let us say
Left_Eye = [-0.06222888 -0.06577308]
Right_Eye = [-0.04176027 -0.44416167]
I want to identify the screen coordinates the person is probably looking at. Is this possible? Please help!
Upvotes: 1
Views: 4008
Reputation: 51903
What you need is:
3D position and direction for each eye
You claim you have it, but pitch and yaw are just Euler angles; you also need a reference frame and the order of transforms to convert them back into a 3D vector. It is better to leave the direction in vector form (which I suspect you had in the first place). Along with the direction, you also need the eye position in 3D, in the same coordinate system...
3D definition of your projection plane
So you need at least a start position and 2 basis vectors defining your planar rectangle. It is much better to use a 4x4 homogeneous transform matrix for this, because that allows very easy transforms to and from its local coordinate system...
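For the first requirement, here is a minimal sketch of turning a (pitch, yaw) pair into a direction vector. The axis convention (x right, y up, z forward, yaw applied about y and then pitch about x) and the function name `gaze_direction` are assumptions made for illustration; the question does not say which convention its gaze estimator uses, so signs and axes may need to be swapped.

```python
import math

def gaze_direction(pitch, yaw):
    """Convert (pitch, yaw) in radians to a unit 3D direction vector.

    ASSUMED convention (not given in the question): x right, y up,
    z forward; yaw rotates about y, then pitch about x.
    """
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

# The angles from the question, interpreted as radians (also an assumption).
left = gaze_direction(-0.06222888, -0.06577308)
right = gaze_direction(-0.04176027, -0.44416167)
```

Note that this only gives the directions; the 3D eye positions still have to come from somewhere else, in the same coordinate system.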
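So I see it like this: it is now just a matter of finding the intersection between the two eye rays and the plane, where R0, L0 are the eye positions, R, L are the gaze directions, P0 is the plane origin and U, V are its basis vectors: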
P(s) = R0 + s*R
P(t) = L0 + t*L
P(u,v) = P0 + u*U +v*V
Solving this system gives u,v, which are also the 2D coordinates, inside your plane, of the point you are looking at. Of course, because of inaccuracies, this will not be solvable algebraically. So it is better to convert the rays into plane-local coordinates and just compute the point on each ray with w=0.0 (making this a simple linear equation with a single unknown), then take the average of the point for the left eye and the point for the right eye (in case they do not align perfectly).
So if R0',R',L0',L' are the values converted into UVW local coordinates (the z subscript below denotes the W component), then:
R0z' + s*Rz' = 0.0
s = -R0z'/Rz'
// so...
R1 = R0' - R'*R0z'/Rz'
L1 = L0' - L'*L0z'/Lz'
P = 0.5 * (R1 + L1)
where P is the point you are looking at, in UVW coordinates.
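The formulas above can be sketched in a few lines. This assumes the rays are already expressed in plane-local UVW coordinates as 3-tuples `(u, v, w)`; the helper names `hit_plane` and `gaze_point` are made up for this example, and the checks for parallel or backward-pointing rays are an addition of mine rather than part of the derivation.

```python
def hit_plane(origin, direction, eps=1e-9):
    """Return (u, v) where origin + s*direction crosses the w = 0 plane.

    Both arguments are 3-tuples in plane-local UVW coordinates.
    """
    if abs(direction[2]) < eps:
        raise ValueError("ray is parallel to the plane")
    s = -origin[2] / direction[2]          # solves origin_w + s*dir_w = 0
    if s < 0:
        raise ValueError("plane is behind the eye")
    return (origin[0] + s * direction[0],
            origin[1] + s * direction[1])

def gaze_point(R0, R, L0, L):
    """Average the two per-eye hits: P = 0.5 * (R1 + L1)."""
    r1 = hit_plane(R0, R)
    l1 = hit_plane(L0, L)
    return (0.5 * (r1[0] + l1[0]), 0.5 * (r1[1] + l1[1]))
```

For example, two eyes 0.5 units in front of the plane, 0.06 apart and both looking straight at it, `gaze_point((0.03, 0, 0.5), (0, 0, -1), (-0.03, 0, 0.5), (0, 0, -1))`, give the midpoint between the two hits.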
The conversion is easy: depending on your notation, you multiply either the inverse or the direct matrix representing the plane by (R,0), (L,0), (R0,1), (L0,1). The fourth coordinate (0 or 1) just tells whether you are transforming a vector or a point.
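A minimal sketch of that conversion in plain Python follows. It assumes the matrix stores U, V, W and P0 as its columns (plane-local to world) and that U, V, W are orthonormal, so the inverse can be built from the transpose; for a non-orthonormal basis a general 4x4 inverse would be needed instead. All function names are hypothetical. Directions get a fourth coordinate of 0 (so the translation is ignored) and positions get 1.

```python
def plane_matrix(P0, U, V, W):
    """4x4 matrix with columns U, V, W, P0: plane-local -> world."""
    return [
        [U[0], V[0], W[0], P0[0]],
        [U[1], V[1], W[1], P0[1]],
        [U[2], V[2], W[2], P0[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]

def rigid_inverse(M):
    """Inverse of M (world -> plane-local). Valid ONLY for orthonormal U, V, W."""
    Rt = [[M[c][r] for c in range(3)] for r in range(3)]   # transposed 3x3 part
    t = [M[r][3] for r in range(3)]                        # translation (P0)
    inv = [Rt[r] + [-sum(Rt[r][c] * t[c] for c in range(3))] for r in range(3)]
    inv.append([0.0, 0.0, 0.0, 1.0])
    return inv

def transform(M, xyz, w):
    """Apply M to (x, y, z, w): w=1 for a point, w=0 for a direction."""
    v = (xyz[0], xyz[1], xyz[2], w)
    return tuple(sum(M[r][c] * v[c] for c in range(4)) for r in range(3))
```

With `inv = rigid_inverse(plane_matrix(P0, U, V, W))`, the local values would be `transform(inv, R0, 1)` for an eye position and `transform(inv, R, 0)` for its gaze direction.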
Without knowing more about your coordinate systems, data accuracy, and which knowns and unknowns you have, it is hard to be more specific than this.
If your plane is the camera projection plane, then U,V are the x and y axes of the image taken by the camera and W is the normal to it (its direction is just a matter of notation).
As you are using camera input, which uses a perspective projection, I hope your positions and vectors are corrected for it.
Upvotes: 5