Explanation of the Perspective Projection Matrix (Second row)

Question

I am trying to figure out how the perspective projection matrix works.

According to this: https://www.opengl.org/sdk/docs/man2/xhtml/gluPerspective.xml

f = cotangent(fovy/2)

Intuitively I understand how it works (the x and y values move further away from the bounding box or vice versa), but I need a mathematical explanation of why this works. Maybe because of the theorem of intersecting lines?

I found an explanation here: http://www.songho.ca/opengl/gl_projectionmatrix.html, but I don't understand the relevant part of it.

Pidhorskyi · Accepted Answer

In my opinion, the explanation of the perspective projection matrix at songho.ca is the best one. I'll try to retell the main idea without going into details. But first, let's clarify why the cotangent is used in the OpenGL docs.

What is the cotangent? According to Wikipedia:

The cotangent of an angle is the ratio of the length of the adjacent side to the length of the opposite side.

Look at the picture below: near is the length of the adjacent side and top is the length of the opposite side. The angle we are interested in is fov/2. The angle fov is the angle between the top plane and the bottom plane, so fov/2 is the angle between the top (or bottom) plane and the symmetry axis.

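[Figure: side view of the view frustum, showing near, top, the angle fov/2, the point A and its projection A' on the near plane]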

So the [1,1] element of the projection matrix, which the OpenGL docs define as cotangent(fovy/2), is equivalent to the ratio near/top.

Consider the point A shown in the picture. Let's find the y' coordinate of the point A', the projection of A onto the near plane.
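As a quick numerical check of this equivalence, here is a minimal Python sketch. The values of fovy and near are arbitrary assumptions chosen for illustration, and top is derived from them via the right triangle described above.

```python
import math

def cot(angle_rad):
    """Cotangent: adjacent / opposite, i.e. cos / sin."""
    return math.cos(angle_rad) / math.sin(angle_rad)

# Illustrative values (assumptions, not taken from the OpenGL docs).
fovy = math.radians(60.0)   # full vertical field of view
near = 0.1                  # distance to the near plane

# Half-height of the near plane, from the right triangle in the picture.
top = near * math.tan(fovy / 2)

f = cot(fovy / 2)
ratio = near / top
print(f, ratio)  # both are approximately 1.732 for fovy = 60 degrees
```

Changing near leaves the ratio unchanged, because top scales with it; only the angle matters.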

Using the ratio of similar triangles, the following relation can be inferred:

y' / near = y / -z

Or:

y' = near * y / -z

The y coordinate in normalized device coordinates can be obtained by dividing by the value top (the range (-top, top) is mapped to the range (-1.0,1.0)), so:

yndc = near / top * y / -z
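The two steps above (similar triangles, then division by top) can be sketched in a few lines of Python. The function name project_y_ndc and the sample values are hypothetical, picked only to make the arithmetic easy to follow.

```python
def project_y_ndc(y, z, near, top):
    """Project eye-space (y, z) onto the near plane, then normalize by top.

    Assumes the OpenGL convention that the camera looks down -z,
    so visible points have z < 0.
    """
    y_near = near * y / -z      # similar triangles: y' / near = y / -z
    return y_near / top         # map (-top, top) to (-1, 1)

# Illustrative values (assumptions): near = 1, top = 1, i.e. fovy = 90 degrees.
near, top = 1.0, 1.0

print(project_y_ndc(2.0, -4.0, near, top))  # 0.5
print(project_y_ndc(4.0, -4.0, near, top))  # 1.0, the point lies on the top plane
print(project_y_ndc(2.0, -8.0, near, top))  # 0.25, farther away means closer to the center
```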

The coefficient near / top is a constant, but what about z? There is one very important detail about normalized device coordinates. The output of the vertex shader is a four-component vector, which the fixed-function pipeline turns into a three-component vector by dividing the first three components by the fourth component (the perspective divide):

(x, y, z, w) -> (x / w, y / w, z / w)

So we can assign the value -z to the fourth component. This is done by setting element [2,3] of the projection matrix (column 2, row 3, counting from zero, i.e. the third entry of the bottom row) to -1.

Similar reasoning can be done for the x coordinate.

We have found the following elements of projection matrix:

| near / right      0               0           0 | 
| 0                 near / top      0           0 |
| 0                 0               ?           ? |
| 0                 0               -1          0 |

Two elements have not been found yet; they are marked with '?'.

To make things clear, let's project an arbitrary point (x,y,z) to normalized device coordinates:

| near / right      0               0           0 |   | x |
| 0                 near / top      0           0 | X | y | = 
| 0                 0               ?           ? |   | z |
| 0                 0               -1          0 |   | 1 |

  | near / right * x |
= | near / top * y   |
  | ?                |
  | -z               |

And finally, after dividing by the w component we will get:

| - near / right * x / z |
| - near / top * y  / z  |
| ?                      |

Note that the result matches the equation inferred earlier.

As for the third component, marked with '?', more complex reasoning is needed to work out how to calculate it. Refer to songho.ca for more information.

I hope my explanation makes things a bit clearer.
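To see the multiplication and the divide by w with concrete numbers, here is a minimal Python sketch. The frustum values and the sample point are assumptions for illustration. The row marked '?' is filled with placeholder zeros only so that the code runs; those zeros are not the real values, and the z component is ignored.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# Illustrative frustum values (assumptions).
near, right, top = 1.0, 2.0, 1.0

projection = [
    [near / right, 0.0,        0.0, 0.0],
    [0.0,          near / top, 0.0, 0.0],
    [0.0,          0.0,        0.0, 0.0],  # the '?' row: placeholder zeros, not the real values
    [0.0,          0.0,       -1.0, 0.0],
]

point = [3.0, 2.0, -4.0, 1.0]  # eye-space point (x, y, z, 1), in front of the camera

x_clip, y_clip, _, w_clip = mat_vec(projection, point)

# Perspective divide (only x and y, since the z row is not derived here).
x_ndc = x_clip / w_clip
y_ndc = y_clip / w_clip

print(w_clip)        # 4.0, i.e. -z
print(x_ndc, y_ndc)  # 0.375 0.5
```

The printed values agree with -near / right * x / z and -near / top * y / z evaluated directly for the same point.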
