1234

Reputation: 579

Understand Translation Matrix in OpenGL

Assume we want to translate a point p(1, 2, 3, w=1) by a vector v(a, b, c, w=0) to a new point p'.

Note: w=0 represents a vector and w=1 represents a point in OpenGL, please correct me if I'm wrong.

By the definition of an affine transformation, we have:

p + v = p'

=> p(1, 2, 3, 1) + v(a, b, c, 0) = p'(1 + a, 2 + b, 3 + c, 1)

=> point + vector = point (everything works as expected)

In OpenGL, the translation matrix is as follows:

1 0 0 a
0 1 0 b
0 0 1 c
0 0 0 1

I assume (a, b, c, 1) is the vector from the affine transformation definition. Why does it have w=1 rather than w=0, as in:

1 0 0 a
0 1 0 b
0 0 1 c
0 0 0 0

Upvotes: 3

Views: 926

Answers (2)

derhass

Reputation: 45322

Note: w=0 represents a vector and w=1 represents a point in OpenGL, please correct me if I'm wrong.

You are wrong. First of all, this doesn't really have anything to do with OpenGL. This is about homogeneous coordinates, which is a purely mathematical concept. It works by embedding an n-dimensional vector space into an (n+1)-dimensional vector space. In the 3D case, we use 4D homogeneous coordinates, with the definition that the homogeneous vector (x, y, z, w) represents the 3D point (x/w, y/w, z/w) in Cartesian coordinates.

As a result, for any w != 0, you get a certain finite point, and for w = 0, you are describing an infinitely far away point in a specific direction. This means that homogeneous coordinates are more powerful in that they can describe infinitely far away points with finite coordinates (which comes in very handy for perspective transformations, where infinitely far away points are mapped to finite points, and vice versa).

You can, as a shortcut, imagine (x, y, z, 0) as some direction vector. But for a point, it is not just w=1, but any w value other than 0. Conceptually, this means that any Cartesian 3D point is represented by a line in homogeneous space (we did go up one dimension, so this actually makes sense).
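To make the "a point is a line in homogeneous space" idea concrete, here is a minimal Python sketch. The helper name to_cartesian is my own and not part of any library, and I use exact fractions only to avoid floating-point noise.

```python
from fractions import Fraction

def to_cartesian(h):
    """Map a homogeneous 4-tuple (x, y, z, w) to the 3D point (x/w, y/w, z/w)."""
    x, y, z, w = h
    if w == 0:
        # w == 0 is a direction (a point at infinity), so there is no finite 3D point.
        raise ValueError("w == 0 describes a direction, not a finite point")
    return (Fraction(x, w), Fraction(y, w), Fraction(z, w))

# Every non-zero multiple of (1, 2, 3, 1) lies on the same line through the
# 4D origin and represents the same Cartesian point (1, 2, 3).
same_point = [to_cartesian((1 * s, 2 * s, 3 * s, s)) for s in (1, 2, -5)]
```

All three entries of same_point compare equal to (1, 2, 3), which is why w=1 is just the most convenient representative and not the only one.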

I assume (a, b, c, 1) is the vector from the affine transformation definition. Why do we have w=1 rather than w=0?

Your assumption is wrong. One thing about homogeneous coordinates is that we do not apply a translation in the 4D space. We get the effect of the translation in the 3D space by actually doing a shearing operation in 4D space.

So what we really want to do in homogeneous space is

(x + w*a, y + w*b, z + w*c, w)

since the 3D interpretation of the resulting vector will then be

(x + w*a) / w  == x/w + a
(y + w*b) / w  == y/w + b
(z + w*c) / w  == z/w + c

which will represent the translation that we were after.
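The shear can be checked numerically with a small sketch in plain Python. The helpers translation, mat_vec and to_cartesian are names I made up for illustration, and the values a=10, b=20, c=30 are arbitrary.

```python
from fractions import Fraction

def translation(a, b, c):
    """The usual 4x4 translation matrix, stored as a list of rows."""
    return [[1, 0, 0, a],
            [0, 1, 0, b],
            [0, 0, 1, c],
            [0, 0, 0, 1]]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

def to_cartesian(h):
    """(x, y, z, w) -> (x/w, y/w, z/w); only valid for w != 0."""
    x, y, z, w = h
    return (Fraction(x, w), Fraction(y, w), Fraction(z, w))

T = translation(10, 20, 30)
p = (2, 4, 6, 2)          # represents the 3D point (1, 2, 3) with w = 2
q = mat_vec(T, p)         # (x + w*a, y + w*b, z + w*c, w)
moved = to_cartesian(q)   # the 3D point after the translation
```

Here q comes out as (22, 44, 66, 2): the 4D offset is scaled by w=2, and after dividing by w the 3D point has moved by exactly (10, 20, 30).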

So to try to make this even more clear:

What you wrote in your question:

p(1, 2, 3, 1) + v(a, b, c, 0) = p(1 + a, 2 + b, 3 + c, 1)

is explicitly not what we want to do. What you describe is an affine translation with respect to the 4D vector space.

But what we actually want is a translation in the 3D Cartesian coordinates, so

 (1, 2, 3) + (a, b, c) = (1 + a, 2 + b, 3 + c)

Applying your formula would actually mean doing a translation in the homogeneous space, whose effect in 3D depends on the w coordinate (the point moves by (a/w, b/w, c/w)), while the formula I gave will always translate the point by (a, b, c), no matter what w we choose for the point.

This is of course not true if we choose w=0. Then we get no change at all, which is also correct, because a translation never changes directions, whereas your formula would change the direction. Your formula is correct only for w=1, which is only a special case. But the key point here is that we are not doing a vector addition after all, but a matrix * vector multiplication. Homogeneous coordinates just allow us (among other, more powerful things) to represent a translation via matrix multiplication. This does not mean that we can interpret the last column as a translation vector as if we were doing vector addition.
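As a rough illustration of the difference, the following Python sketch applies both the 4D vector addition from the question and the matrix multiplication to a point with w=2 and to a direction with w=0. All helper names and the offset (10, 20, 30) are my own choices for the example.

```python
from fractions import Fraction

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

def add_vec(p, v):
    """Component-wise 4D addition, i.e. the formula from the question."""
    return tuple(pi + vi for pi, vi in zip(p, v))

def to_cartesian(h):
    """(x, y, z, w) -> (x/w, y/w, z/w); only valid for w != 0."""
    x, y, z, w = h
    return (Fraction(x, w), Fraction(y, w), Fraction(z, w))

T = [[1, 0, 0, 10],
     [0, 1, 0, 20],
     [0, 0, 1, 30],
     [0, 0, 0, 1]]
v = (10, 20, 30, 0)

p = (2, 4, 6, 2)                               # the 3D point (1, 2, 3), with w = 2
matrix_moved = to_cartesian(mat_vec(T, p))     # moved by the full (10, 20, 30)
added_moved = to_cartesian(add_vec(p, v))      # moved by only (5, 10, 15)

d = (1, 0, 0, 0)                               # a direction along the x axis
matrix_dir = mat_vec(T, d)                     # unchanged
added_dir = add_vec(d, v)                      # now points somewhere else
```

With w=2 the addition moves the point by half the intended offset, and with w=0 it changes the direction, while the matrix product behaves as described above in both cases.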

Upvotes: 4

BDL

Reputation: 22167

Simple Answer

The reason is the way matrix multiplication works. If you multiply a matrix by a vector, the w-component of the result is the inner product of the 4th row of the matrix with the vector. After applying the transformation, a point should still be a point and a direction should still be a direction. If you set that row to the zero vector, the resulting w will always be 0, and thus the result will have changed from a position (w=1) to a direction (w=0).
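A small sketch in plain Python shows what happens with each bottom row. The helper mat_vec and the offset (10, 20, 30) are only illustrative choices, not anything OpenGL provides.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

# Standard translation matrix: bottom row (0, 0, 0, 1).
proper = [[1, 0, 0, 10],
          [0, 1, 0, 20],
          [0, 0, 1, 30],
          [0, 0, 0, 1]]

# The variant proposed in the question: bottom row (0, 0, 0, 0).
zeroed = [row[:] for row in proper[:3]] + [[0, 0, 0, 0]]

point = (1, 2, 3, 1)
direction = (1, 0, 0, 0)

proper_point = mat_vec(proper, point)          # w stays 1: still a point
proper_direction = mat_vec(proper, direction)  # w stays 0: still a direction
zeroed_point = mat_vec(zeroed, point)          # w becomes 0: no longer a point
```

The bottom row (0, 0, 0, 1) simply copies the input w into the output, which is what keeps points as points and directions as directions.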

More detailed answer

The definition of an affine transformation is:

x' = A * x + t,

where A is a linear map and t is a translation. Traditionally, mathematicians write linear maps in matrix form. Note that t is here, like x, a 3-dimensional vector. It would be cumbersome (and less general, thinking of projective mappings) if we always had to handle the linear mapping matrix and the translation vector separately. This can be solved by introducing an additional dimension to the mapping, the so-called homogeneous coordinate, which allows us to store the linear mapping as well as the translation vector in a combined 4x4 matrix. This is called the augmented matrix and, by definition,

  x'      A | t       x 
[   ] = [   |   ] * [   ]
  1       0 | 1       1

It should also be noted that affine transformations can now be combined very easily by just multiplying their augmented matrices, which would be hard to do in matrix-plus-vector notation.

One should also note that the bottom-right 1 is not part of the translation vector, which is still 3-dimensional, but part of the matrix augmentation.
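To sketch how the combination works, the following plain Python example builds two augmented matrices and multiplies them. The helper names (augment, mat_mul, mat_vec) and the concrete maps (a uniform scale by 2, then a 90 degree rotation about z) are assumptions chosen for illustration.

```python
def augment(A, t):
    """Build the 4x4 augmented matrix [A | t; 0 0 0 1] from a 3x3 A and a 3-vector t."""
    return [list(A[r]) + [t[r]] for r in range(3)] + [[0, 0, 0, 1]]

def mat_mul(m, n):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(m[r][k] * n[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

# First transformation: x' = A1 * x + t1 (scale by 2, then shift along x).
M1 = augment([[2, 0, 0], [0, 2, 0], [0, 0, 2]], (1, 0, 0))
# Second transformation: x'' = A2 * x' + t2 (rotate 90 degrees about z, then shift along y).
M2 = augment([[0, -1, 0], [1, 0, 0], [0, 0, 1]], (0, 5, 0))

x = (1, 2, 3, 1)
step_by_step = mat_vec(M2, mat_vec(M1, x))   # apply M1, then M2
combined = mat_mul(M2, M1)                   # one matrix doing both
all_at_once = mat_vec(combined, x)
```

Both routes give the same result, and the product still has the bottom row (0, 0, 0, 1), so it is again an augmented matrix of an affine transformation.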

You might also want to read the section about "Augmented matrix" here: https://en.wikipedia.org/wiki/Affine_transformation#Augmented_matrix

Upvotes: 3
