2011-01-09

What matrix gluLookAt generates? (3)

gluLookAt implementation in python

Here is my gluLookAt matrix computation code in Python (numpy) and its test code. This code favors comprehensibility over efficiency, so it contains an explicit matrix multiplication and a matrix transpose. You can remove them if you want more efficient code. The test compares the result of this implementation with OpenGL's result and requires every element to differ by less than 1.0e-6. As far as I tested, the result is within this error threshold on Ubuntu 10.04, Python 2.6, GLX version 1.4.

-----
# Copyright (C) 2010-2011 H. Yamauchi
# under New (3-clause) BSD license

import math

import numpy
from OpenGL import GL, GLU

#
# get lookat matrix (gluLookAt compatible matrix) python, numpy
#
# \param[in]  _eye    eye point
# \param[in]  _lookat lookat point
# \param[in]  _up     up vector
# \return 4x4 gluLookAt matrix
def getLookAtMatrix(_eye, _lookat, _up):
  ez = _eye - _lookat
  ez = ez / numpy.linalg.norm(ez)

  ex = numpy.cross(_up, ez)
  ex = ex / numpy.linalg.norm(ex)

  ey = numpy.cross(ez, ex)
  ey = ey / numpy.linalg.norm(ey)

  rmat = numpy.eye(4)
  rmat[0][0] = ex[0]
  rmat[0][1] = ex[1]
  rmat[0][2] = ex[2]

  rmat[1][0] = ey[0]
  rmat[1][1] = ey[1]
  rmat[1][2] = ey[2]

  rmat[2][0] = ez[0]
  rmat[2][1] = ez[1]
  rmat[2][2] = ez[2]

  tmat = numpy.eye(4)
  tmat[0][3] = -_eye[0]
  tmat[1][3] = -_eye[1]
  tmat[2][3] = -_eye[2]

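  # numpy.array * is element-wise multiplication, use dot().
  # The product is transposed so that it matches the layout of the
  # array returned by glGetDoublev (OpenGL stores matrices column-major).
  lookatmat = numpy.dot(rmat, tmat).transpose()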

  return lookatmat


#
# test getLookAtMatrix routine
#
# generate two matrices, glmat by gluLookAt, mat by getLookAtMatrix.
# Then compare them. (To run this test, you need OpenGL bindings and
# also your Camera implementation that provides eye, lookat, up.)
#
def test_gluLookAt_matrix():

    GL.glLoadIdentity()

    # This is your camera. It tells eye, lookat, up.
    [ep, at, up] = gl_camera.get_lookat()
    GLU.gluLookAt(ep[0], ep[1], ep[2],
                  at[0], at[1], at[2],
                  up[0], up[1], up[2])

    # OpenGL gluLookAt matrix
    glmat = None
    glmat = GL.glGetDoublev(GL.GL_MODELVIEW_MATRIX)

    # my matrix
    mat   = getLookAtMatrix(ep, at, up)

    # debug output (single-argument print() also works in Python 2)
    print('glmat')
    print(glmat)
    print('mat')
    print(mat)

    # compare glmat and mat element by element
    for i in range(4):
        for j in range(4):
            if math.fabs(glmat[i][j] - mat[i][j]) > 1.0e-6:
                raise ValueError('matrix does not match.')
---
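
If no OpenGL context is at hand, the matrix can still be sanity-checked from two properties a look-at matrix should have: the eye point lands on the origin, and the lookat point lands on the negative Z axis at the eye-to-lookat distance. The following is only a sketch under assumptions: look_at is a hypothetical compact restatement of getLookAtMatrix (same transposed layout), and the eye, lookat, up values are made up for illustration.

```python
import numpy


def look_at(eye, lookat, up):
    """Hypothetical compact restatement of getLookAtMatrix (transposed layout)."""
    ez = eye - lookat
    ez = ez / numpy.linalg.norm(ez)
    ex = numpy.cross(up, ez)
    ex = ex / numpy.linalg.norm(ex)
    ey = numpy.cross(ez, ex)  # already unit length: ez and ex are orthonormal

    rmat = numpy.eye(4)
    rmat[0, :3] = ex
    rmat[1, :3] = ey
    rmat[2, :3] = ez

    tmat = numpy.eye(4)
    tmat[:3, 3] = -eye
    return numpy.dot(rmat, tmat).transpose()


# Made-up camera parameters for illustration.
eye = numpy.array([1.0, 2.0, 3.0])
lookat = numpy.array([0.0, 0.0, 0.0])
up = numpy.array([0.0, 1.0, 0.0])
mat = look_at(eye, lookat, up)

# The matrix is transposed, so a point is a row vector multiplied from the left.
eye_cam = numpy.dot(numpy.append(eye, 1.0), mat)
lookat_cam = numpy.dot(numpy.append(lookat, 1.0), mat)
dist = numpy.linalg.norm(eye - lookat)

print(eye_cam)     # expected to be close to (0, 0, 0, 1)
print(lookat_cam)  # expected to be close to (0, 0, -dist, 1)
```

This check does not replace the comparison against gluLookAt above; it only confirms that the matrix behaves like a camera transform.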

What matrix gluLookAt generates? (2) Why Gimbal lock happens?

gluLookAt matrix (why Gimbal lock happens?) 

The camera posture is represented by a rotation matrix, and the camera position is represented by a translation matrix. OpenGL defines a default camera position and posture, and we move this camera around in the program. By default the camera is at the origin, looks in the minus direction of the Z axis, and its up direction is the plus direction of the Y axis. The rotation matrix R and the translation matrix T look like the following.

where,


  • x: camera basis X axis, ``right'' in the figure, ``side'' in the Mesa program in the last blog entry
  • y: camera basis Y axis, ``up'' in the figure, ``up'' in the Mesa program in the last blog entry
  • z: camera basis Z axis, ``Z'' in the figure, ``-forward'' in the Mesa program in the last blog entry
  • e: camera position, ``eye'' in the figure, ``eyex, eyey, eyez'' in the Mesa program in the last blog entry


In the Mesa program, eye, lookat, and up are given to the gluLookAt function, and the basis vectors are computed as follows:

  • z = normalize(eye - lookat)
  • x = normalize(cross(up,z))
  • y = normalize(cross(z, x))

Here, ``cross'' is a cross product function and ``normalize'' is a vector normalization function. Please note the computation order: it is z, x, y instead of x, y, z.
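
The z, x, y order can be sketched in numpy as follows. The function name camera_basis and the sample vectors are assumptions made for illustration. The up vector is deliberately not perpendicular to the view direction, to show why y is derived last: z is fixed by the view direction, x needs both up and z, and only then can y be made perpendicular to both.

```python
import numpy


def camera_basis(eye, lookat, up):
    """Return the camera basis (x, y, z), computed in the order z, x, y."""
    z = eye - lookat
    z = z / numpy.linalg.norm(z)
    x = numpy.cross(up, z)
    x = x / numpy.linalg.norm(x)
    y = numpy.cross(z, x)
    y = y / numpy.linalg.norm(y)
    return x, y, z


# Made-up input: the up vector leans toward the viewer, so it is not
# perpendicular to the view direction.
eye = numpy.array([0.0, 0.0, 5.0])
lookat = numpy.array([0.0, 0.0, 0.0])
up = numpy.array([0.0, 1.0, 0.5])

x, y, z = camera_basis(eye, lookat, up)
print(x, y, z)
```

Note that y is not simply the normalized up vector: the given up only selects the plane in which y lies.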

By the way, you might notice that z is the negative of the view direction. In OpenGL the camera looks along the minus z direction; the plus z direction is not the viewing direction but points behind the camera. Moreover, a larger (positive) depth means an object is farther from the camera, yet with the default camera such an object lies at a more negative z in world coordinates. A coordinate system is just a representation and you can choose any, so this is not wrong. For example, Y plus is the up direction in the viewport, but Y plus is the down direction in screen coordinates. These differences between world coordinates and the camera's coordinates usually confuse me.

Let me explain a bit about the matrix R. Why is this a rotation matrix? If you look at how x, y, z are constructed, they are mutually perpendicular and normalized, so each of them is a vector of length 1. A transformation that changes no lengths and changes only the direction of an object is a rotation. For me, it is more intuitive to think of this matrix as a coordinate transformation. When someone says ``rotation,'' I imagine a rotation axis and a rotation angle, but I can hardly see them in the matrix construction. I can interpret the matrix R as a coordinate transformation, since I can apply R to the standard basis as follows:


As you see, [1 0 0 0]^T becomes x, [0 1 0 0]^T becomes y, and [0 0 1 0]^T becomes z. Moreover, [2 0 0 0]^T becomes 2x, which means the length along each axis is preserved. Therefore, I can easily see the coordinate transformation in the matrix R. This matrix never changes lengths; only the direction is changed. This is the same as the rotation of a rigid body.
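
These properties can be checked numerically. The sketch below assumes that R holds x, y, z as its columns (the layout in which [1 0 0 0]^T maps to x) and uses a made-up camera that looks from (1, 2, 3) toward the origin.

```python
import numpy

# Made-up camera: looking from (1, 2, 3) toward the origin, up = +Y.
z = numpy.array([1.0, 2.0, 3.0])
z = z / numpy.linalg.norm(z)
x = numpy.cross(numpy.array([0.0, 1.0, 0.0]), z)
x = x / numpy.linalg.norm(x)
y = numpy.cross(z, x)

# R with the basis vectors as columns (assumed layout).
R = numpy.eye(4)
R[:3, 0] = x
R[:3, 1] = y
R[:3, 2] = z

e1 = numpy.array([1.0, 0.0, 0.0, 0.0])
v = numpy.array([2.0, -1.0, 0.5, 0.0])

# The first standard basis vector is mapped to x.
maps_e1_to_x = numpy.allclose(numpy.dot(R, e1)[:3], x)
# The length of an arbitrary vector is unchanged.
keeps_length = numpy.isclose(numpy.linalg.norm(numpy.dot(R, v)),
                             numpy.linalg.norm(v))
# Orthonormal columns give R^T R = I; determinant +1 means no mirroring.
is_orthonormal = numpy.allclose(numpy.dot(R.T, R), numpy.eye(4))
determinant = numpy.linalg.det(R)

print(maps_e1_to_x, keeps_length, is_orthonormal, determinant)
```

An orthonormal matrix with determinant +1 is exactly a rotation, even though no axis or angle appears in its construction.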

The remaining operation T is the translation, the movement of the camera position. You may notice that the matrix TR has dot products in its bottom row. This is because a transformation from one axis to another is a projection, as shown in Figure 2. Therefore you need the cosine value, and this is what a dot product gives you.


Figure 2: Basis transformation and the relationship with dot product
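
The dot products in the bottom row of TR can be reproduced with a small numpy sketch. It assumes the same layout as the matrix compared against glGetDoublev: R has x, y, z as columns and T carries -e in its bottom row. The camera values are made up for illustration.

```python
import numpy

# Made-up camera parameters.
eye = numpy.array([1.0, 2.0, 3.0])
lookat = numpy.array([0.0, 1.0, 0.0])
up = numpy.array([0.0, 1.0, 0.0])

z = (eye - lookat) / numpy.linalg.norm(eye - lookat)
x = numpy.cross(up, z)
x = x / numpy.linalg.norm(x)
y = numpy.cross(z, x)

# R: basis vectors as columns (assumed layout).
R = numpy.eye(4)
R[:3, 0] = x
R[:3, 1] = y
R[:3, 2] = z

# T: translation by -e, stored in the bottom row (assumed layout).
T = numpy.eye(4)
T[3, :3] = -eye

TR = numpy.dot(T, R)

# Each bottom-row entry is the eye position projected onto one camera
# axis, negated.
bottom = TR[3, :3]
expected = -numpy.array([numpy.dot(eye, x),
                         numpy.dot(eye, y),
                         numpy.dot(eye, z)])
print(bottom)
print(expected)
```

The upper 3x3 part of TR is unchanged from R; only the bottom row picks up the projections of the eye position.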

In this article, I interpreted what the matrix in Mesa's gluLookAt code means. When you write a program to move a camera, you need to avoid the gimbal lock effect. Gimbal lock usually happens when the view direction and the up vector are very close. To avoid this, move the camera as a rigid body, as this matrix does. Figure 3 (b) shows an implementation that changes only the view direction but does not change the camera's orientation. In reality, you would have to break the lens or crush the camera to produce the gimbal lock effect. To avoid this, you can rotate the camera itself, as in Figure 3 (c).


Figure 3: Gimbal lock camera. When you want to look a bit upward, (b) the bending-the-lens implementation causes gimbal lock, (c) rotating the camera causes no gimbal lock
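
One way to see the problem numerically is to keep the up vector fixed, as in case (b), and tilt the view direction toward it. This sketch illustrates only the degenerate cross product, not a full camera controller; the helper name side_length and the angles are assumptions made for illustration.

```python
import math

import numpy


def side_length(angle_deg):
    """Length of cross(up, z) before normalization, for a view direction
    tilted angle_deg away from a fixed up vector."""
    up = numpy.array([0.0, 1.0, 0.0])
    a = math.radians(angle_deg)
    # Unit view direction in the YZ plane, angle_deg away from up.
    view = numpy.array([0.0, math.cos(a), -math.sin(a)])
    z = -view
    return numpy.linalg.norm(numpy.cross(up, z))


# The closer the view direction gets to up, the shorter the side vector.
angles = (90.0, 10.0, 1.0, 0.0)
lengths = [side_length(a) for a in angles]
for a, length in zip(angles, lengths):
    print(a, length)
```

When the length reaches zero, normalizing x divides by zero and the camera basis is no longer defined. Rotating the camera as a rigid body, as in (c), rotates the up vector together with the view direction, so the two never become parallel.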

Next time, I will show you my gluLookAt implementation in Python. As I mentioned in the motivation, this is (only) useful if you want to write a renderer that does not depend on OpenGL but is used together with OpenGL.

2011-01-07

What matrix gluLookAt generates? (1)

gluLookAt function

I use the OpenGL API to draw 3D scenes. This API is often used in games. If you stay within the OpenGL world, you usually don't need to know what kind of matrices are created underneath the API. However, if you want to overlay something on the OpenGL scene, for instance if you write your own renderer and want to render the scene both with OpenGL and with your own renderer, it is sometimes useful to know what kind of matrices the OpenGL API computes.

Therefore, this blog entry is quite specific. If you are not interested in such a story, see you next time.

I would like to talk about the matrix that is computed by the gluLookAt function.

gluLookAt places a camera with a position and a posture, e.g., which direction is up. Recently, I needed to know how to generate this matrix on my own, but I could not find a page with a good explanation. In the end, I looked into the source code of Mesa 7.5.1 (Mesa-7.5.1/src/glu/sgi/libutil/project.c), a software implementation of OpenGL.

The code is quite readable and has useful comments. Here is Mesa 7.5.1's gluLookAt() source code.


/*
 * SGI FREE SOFTWARE LICENSE B (Version 2.0, Sept. 18, 2008)
 * Copyright (C) 1991-2000 Silicon Graphics, Inc. All Rights Reserved.
 */
void GLAPIENTRY
gluLookAt(GLdouble eyex, GLdouble eyey, GLdouble eyez, GLdouble centerx,
          GLdouble centery, GLdouble centerz, GLdouble upx, GLdouble upy,
          GLdouble upz)
{
    float forward[3], side[3], up[3];
    GLfloat m[4][4];

    forward[0] = centerx - eyex;
    forward[1] = centery - eyey;
    forward[2] = centerz - eyez;

    up[0] = upx;
    up[1] = upy;
    up[2] = upz;

    normalize(forward);

    /* Side = forward x up */
    cross(forward, up, side);
    normalize(side);

    /* Recompute up as: up = side x forward */
    cross(side, forward, up);

    __gluMakeIdentityf(&m[0][0]);
    m[0][0] = side[0];
    m[1][0] = side[1];
    m[2][0] = side[2];

    m[0][1] = up[0];
    m[1][1] = up[1];
    m[2][1] = up[2];

    m[0][2] = -forward[0];
    m[1][2] = -forward[1];
    m[2][2] = -forward[2];

    glMultMatrixf(&m[0][0]);
    glTranslated(-eyex, -eyey, -eyez);
}


By the way, I am always confused by the OpenGL coordinate system and matrix contents. You can even find the following on the official OpenGL web pages. http://www.opengl.org/resources/faq/technical/transformations.htm

Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.
It explicitly says that you will probably be confused.

I usually think about them first in standard math notation and then translate that into the implementation. Here I would like to use the same strategy. If you would like to implement this in C/C++, you need to think about the memory layout of your matrices.
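
The layout question can be made concrete with numpy, which can flatten the same matrix in either order. In this sketch the translation values are made up; the one OpenGL fact it relies on is that the translation occupies elements 12, 13, 14 of the 16-element array passed to calls such as glMultMatrixf.

```python
import numpy

# A translation matrix written in standard math notation:
# the translation sits in the last column.
m = numpy.eye(4)
m[:3, 3] = [10.0, 20.0, 30.0]

# Row-major (C order): rows are stored one after another.
row_major = m.flatten(order='C')
# Column-major (Fortran order): columns are stored one after another,
# which is the order OpenGL expects.
col_major = m.flatten(order='F')

print(row_major)
print(col_major)

# Flattening the transpose in C order gives the column-major array too.
same_as_transpose = numpy.array_equal(col_major, m.T.flatten(order='C'))
print(same_as_transpose)
```

In the row-major array the translation is scattered over elements 3, 7, 11, while in the column-major array it sits in elements 12, 13, 14. Transposing a math-notation matrix before handing it over as a C array has the same effect as switching to column-major order.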

The top of Figure 1 shows the camera coordinates, and the bottom shows that the camera coordinates have no fixed relationship with the global/world coordinates.

We can assume a camera is a rigid body, so you can move and rotate it. You might be able to stretch, shrink, or crush a real camera, but we don't consider such operations on an OpenGL camera.

Figure 1: Camera coordinates. Camera coordinates and other coordinates

Before creating a matrix, I would like to mention one more thing: model and view duality. Consider the following two cases:

  • The scene is static (doesn't move), but your camera moves forward 1m.
  • Your camera doesn't move, but the whole scene moves 1m backward relative to the camera.

The camera sees exactly the same scene in these two cases. You might have experienced this on a train waiting at a station for departure: sometimes you cannot tell whether your train has started moving or the train on the other side is moving. This is an example of camera movement versus scene movement (the other train moving). The object movement (model) and the camera movement (view) are combined into one matrix, called the model-view matrix in OpenGL.
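
The two cases can be compared with plain translation matrices. This is a sketch in standard math notation (column vectors, modelview = view times model); the helper name translation and the sample point are assumptions made for illustration.

```python
import numpy


def translation(t):
    """4x4 translation matrix in math notation (offset in the last column)."""
    m = numpy.eye(4)
    m[:3, 3] = t
    return m


# A point 5m in front of the default camera, which looks down -Z.
p = numpy.array([0.0, 0.0, -5.0, 1.0])
step = numpy.array([0.0, 0.0, -1.0])  # 1m forward for the default camera

# Case 1: the scene is static and the camera moves forward by step.
# The view matrix is the inverse of the camera motion.
view_1 = translation(-step)
model_1 = numpy.eye(4)

# Case 2: the camera is static and the whole scene moves backward by step.
view_2 = numpy.eye(4)
model_2 = translation(-step)

modelview_1 = numpy.dot(view_1, model_1)
modelview_2 = numpy.dot(view_2, model_2)

p_cam_1 = numpy.dot(modelview_1, p)
p_cam_2 = numpy.dot(modelview_2, p)
print(p_cam_1, p_cam_2)
```

Both cases give the same combined matrix, and the point ends up 4m in front of the camera either way, so the camera cannot tell the two cases apart.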

This story is about the camera position and posture matrix, so it should be called a ``view matrix.'' However, OpenGL calls it the model-view matrix because of the model-view duality. In implementations it is called the modelview matrix, though I only talk about the view matrix here.

Next time, I would like to talk about the matrices generated by gluLookAt.