May Oakes

Reputation: 4619

How do you calculate the transformation matrix for a camera?

I have a camera with the position, forward, right, and down vectors defined as class members _position, _forward, _right, and _down. For the rotation part of the problem, I simply load the rows of the destination coordinate system (view space) with _forward, _right, and _down and then apply a rotation matrix to go into "OpenGL" view space.

Before that I translate by the negative of _position. Is this correct, or do I need to do more math to determine the actual translation to pre- or post-multiply by my rotation?

Below is the code I have that doesn't seem to work. Specifically, I am rendering an object (a small quad facing down the z-axis) at the origin of world space and when I run the program that object appears distorted. (Note that the mat4 constructor takes elements in ROW major order though it stores them in column major internally).

mat4 Camera::matrix() const
{
    return
        mat4(
            0.0f, 0.0f, -1.0f, 0.0f,
            1.0f, 0.0f, 0.0f, 0.0f,
            0.0f, -1.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 0.0f, 1.0f) *
        mat4(
            _forward.x, _forward.y, _forward.z, 0.0f,
            _right.x, _right.y, _right.z, 0.0f,
            _down.x, _down.y, _down.z, 0.0f,
            0.0f, 0.0f, 0.0f, 1.0f) *
        translate(
            -_position.x,
            -_position.y,
            -_position.z
        );
}

Here is the code for the look at function that is being used before the above.

void Camera::look_at(const vec3& position, const vec3& point)
{
    _position = position;
    _forward = normalize(point - position);
    _right = cross(_forward, vec3(0.0f, 1.0f, 0.0f));
    _down = cross(_forward, _right);
}

In my initialization, I use this code:

camera.look_at(vec3(1.0f, 1.0f, 1.0f), vec3(0.0f, 0.0f, 0.0f));
view = camera.matrix();
proj = perspective(60.0f, SCREEN_WIDTH / (float)SCREEN_HEIGHT, 0.001f, 100.0f);
view_proj = proj * view;

Upvotes: 1

Views: 3208

Answers (2)

May Oakes

Reputation: 4619

The problem of calculating the view matrix (the answer to the specific question "How do you calculate the transformation matrix for a camera?") is the problem of transforming vectors and points from world space to view space. This is the same as transforming from any source space to a destination space. (For the rest of this answer, source space refers to world space and destination space refers to view space.) We are given the destination basis vectors (the rotation vectors) relative to source space (the _right x, _up y, and _backward z camera vectors as vec3s) and the destination space's origin relative to source space (the _position point as a vec3). While the question uses _forward, _right, and _down, we will first work with _right, _up, and _backward to stay consistent with OpenGL view space, and at the end adapt the solution to the former vectors. In addition, the term space will refer to a coordinate system basis (the orientation vectors) together with a frame origin (the position vector).

The transformation matrix from a source space to a destination space can be formulated as a rotation so that all vectors and points are oriented correctly with respect to destination space, and then a translation so that all points are positioned correctly with respect to destination space. Any transformation matrix times a point is always a rotation (a linear combination of the rows of the matrix and the vector) and a translation (offsetting the result of the rotation by the fourth column vector). Note that a direction vector (denoted in homogeneous coordinates by the fourth component of the vector being set to zero) is not affected by the translation. We will now look at two methods to derive such a matrix, where the second method is computationally more efficient.

Let the offset vector be the vector that points the entire distance from the destination space's origin to the source space's origin. The first method, which is used in the code snippet provided in the question, is to first translate by the offset vector relative to source space. This operation repositions all points (but not vectors) so that the destination origin becomes their reference point. Second, a rotation is applied by loading the basis vectors of destination space relative to source space (i.e. _right, _up, and _backward) into the first three rows of the matrix. Why rows? A vector expressed in any coordinate system is simply a linear combination of that system's basis vectors, and each component of the product of a 4x4 matrix with a 4-component column vector is the dot product of the corresponding matrix row with the vector. Those dot products against the basis-vector rows are precisely the linear combination we're after, and are why homogeneous coordinates work. The following is sample code that implements this method. Note that the translate() function returns a translation matrix.

mat4 Camera::matrix() const
{
    return
        mat4(
               _right.x,    _right.y,    _right.z, 0.0f,
                  _up.x,       _up.y,       _up.z, 0.0f,
            _backward.x, _backward.y, _backward.z, 0.0f,
                   0.0f,        0.0f,        0.0f, 1.0f) *
        translate(
            -_position.x,
            -_position.y,
            -_position.z
        );
}

The second, more efficient method is to generate the matrix directly instead of multiplying two matrices, avoiding the cost of a full rotation-matrix-times-translation-matrix product. First, as before, the destination basis vectors are loaded into the first three rows. Then the translation (the first three elements of the fourth column) is calculated as the negated dot product of each basis vector with the position.

mat4 Camera::matrix() const
{
    return mat4(
           _right.x,    _right.y,    _right.z, -dot(   _right, _position),
              _up.x,       _up.y,       _up.z, -dot(      _up, _position),
        _backward.x, _backward.y, _backward.z, -dot(_backward, _position),
               0.0f,        0.0f,        0.0f,                      1.0f);
}

Note that the dot products are conceptually dot(_right, -_position) etc., but we can save work by negating the scalar result once, -dot(_right, _position), instead of negating all three components of _position.

These dot products are exactly the same as multiplying the rotation matrix by the translation matrix previously. For example, the following matrix multiplication demonstrates the situation in the first method. Compare it to the code above with the dot products.

| x0 x1 x2 0 |   | 1 0 0 -t0 |   | x0 x1 x2 -(x0*t0 + x1*t1 + x2*t2) |
| y0 y1 y2 0 |   | 0 1 0 -t1 |   | y0 y1 y2 -(y0*t0 + y1*t1 + y2*t2) |
| z0 z1 z2 0 | * | 0 0 1 -t2 | = | z0 z1 z2 -(z0*t0 + z1*t1 + z2*t2) |
|  0  0  0 1 |   | 0 0 0   1 |   |  0  0  0                       1  |

Note that at all times, the _right, _up, and _backward vectors (or _forward, _right, and _down vectors as in the question) must be normalized or else some unintentional scaling will result.

To account for using the _forward, _right, and _down vectors, an additional rotation needs to be applied to get to OpenGL view space. This matrix is listed below:

|  0  1  0  0 |
|  0  0 -1  0 |
| -1  0  0  0 |
|  0  0  0  1 |

The rows in the upper left 3x3 submatrix are the basis vectors of the OpenGL view space (right x, up y, backward z) relative to the desired camera space (forward x, right y, down z). We then multiply this matrix with the matrix from before with the dot products. Such a matrix multiplication is similar to:

|  0  1  0  0 |   | x0 x1 x2 t0 |   |  y0  y1  y2  t1 |
|  0  0 -1  0 |   | y0 y1 y2 t1 |   | -z0 -z1 -z2 -t2 |
| -1  0  0  0 | * | z0 z1 z2 t2 | = | -x0 -x1 -x2 -t0 |
|  0  0  0  1 |   |  0  0  0  1 |   |   0   0   0   1 |

The final matrix function could then be:

mat4 Camera::matrix() const
{
    float tx =  dot(_forward, _position);
    float ty = -dot(_right, _position);
    float tz =  dot(_down, _position);

    return mat4(
           _right.x,    _right.y,    _right.z,   ty,
           -_down.x,    -_down.y,    -_down.z,   tz,
        -_forward.x, -_forward.y, -_forward.z,   tx,
               0.0f,        0.0f,        0.0f, 1.0f);
}

So it appears that, besides some avoidable matrix multiplications, the code in the question has two problems: the vectors are never normalized in the look_at() function, and the axis-conversion matrix in the question's matrix() is the transpose (i.e. the inverse) of the one derived above, so it maps OpenGL view space into camera space rather than the other way around.

Upvotes: 3

Harish

Reputation: 974

I'd suggest using glm::lookAt() for convenience.

For more info, refer to https://glm.g-truc.net/0.9.2/api/a00245.html#ga2d6b6c381f047ea4d9ca4145fed9edd5

But here is how one would construct it from position, target, and up vectors.

Note this is the same as the lookAt() function.

//creates a lookat matrix
mat4 lookAt(Vec3 eye, Vec3 center, Vec3 up)
{
    Vec3 left, up2, forward;

    // make rotation matrix

    // forward vector
    forward = center - eye;
    forward.Normalize();

    // up2 vector (assuming up is normalized)
    up2 = up;

    // left vector = up2 cross forward
    left = up2.Cross(forward);

    // Recompute up2 = forward cross left (in case up and forward were not orthogonal)
    up2 = forward.Cross(left);

    // cross product gives area of parallelogram, which is < 1.0 for
    // non-perpendicular unit-length vectors; so normalize left, up2 here
    left.Normalize();
    up2.Normalize();

    return mat4(
        left.x, left.y, left.z, left.Dot(-eye),
        up2.x, up2.y, up2.z, up2.Dot(-eye),
        -forward.x, -forward.y, -forward.z, -forward.Dot(-eye),
        0.0f, 0.0f, 0.0f, 1.0f);
}

Note that up2 = up when up and forward are orthogonal; otherwise up2 is the up vector re-orthogonalized against forward.

Upvotes: 2
