Transform matrices: using matrix multiplication for transforming coordinates in 3d space
Sep 18, 2008 in geektalk, visual effects, visual effects pipeline
As I'd promised earlier, I'll be doing a quick run-through of how matrix multiplication can be used to rotate, scale and translate coordinates through 3d space. First, though, I'm going to write a little bit about a tool I've been working on and just launched. Sorry: the actual code for it is proprietary, so I can't share it, but a brief discussion of the tool and what it needed to be able to do will help illustrate why matrix transformation is so important, even when you're not creating a 3d package from scratch or developing a game engine.
I needed a way to quickly create a usable "right eye" camera to correspond to a "left eye" camera. I needed to be able to specify interocular distance (the distance between the left and right eye), as well as the convergence distance (where the eyes were focusing). Our eyes naturally "converge" on objects that are in front of us - this is part of how we judge relative depth, as this convergence affects the apparent relationship of the dual images our eyes perceive of *other* objects that are not at the point of convergence. But that's a subject for another entry.
Anyway, the tool needed to be able to take a tracked "left eye" camera and apply that tracking data to generate the "right eye" camera using convergence and interocular information that was either derived from a track, noted on set, or dynamically assigned in Maya. I wanted a right eye camera that could be created instantly and that would automatically follow any changes I made to the left eye camera. If I smoothed that camera's path, I needed the right eye to follow suit. If I performed some dramatic transformation to that camera's path, perhaps extending its animation or re-animating an object that was part of the track while conforming the camera to that object's new path, I needed the right eye camera to keep up. If a cg element was being added, moving toward the camera, I needed to be able to selectively lock the convergence to "look at" that object's depth without diverting the cameras to actually look directly at it - similar to changing the focus depth.
It's all about control and instantaneous response: one camera whose path is known with some degree of certainty, and a second camera that follows it in a way that will produce realistic cg. The same setup enables artists to quickly and accurately duplicate the second camera position when the original interocular distance and convergence information are not known.
Matrix multiplication, as a mathematical operation, isn't complicated - it's simple multiplication and addition, just repeated a number of times to generate the required result matrix. Since there are already more-than-ample tutorials online about how to actually *perform* matrix math (if you were cursed to do it by hand or develop procedures to support it in programming environments that don't already have matrix math support), I won't be covering that in detail. If my simple explanation doesn't quite do it for you, put "matrix multiplication" into Google.
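For anyone who does want to see the "multiplication and addition, repeated" part spelled out, here is a minimal sketch in plain Python. The function name `mat_mul` and the list-of-rows representation are assumptions made for this illustration, not part of any particular library.

```python
def mat_mul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p); both are lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("columns of a must match rows of b")
    result = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            # Entry (i, j): multiply row i of a against column j of b, then add.
            result[i][j] = sum(a[i][k] * b[k][j] for k in range(m))
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = mat_mul(a, b)
print(product)  # [[19, 22], [43, 50]]
```

The same routine works unchanged for the 4x4 matrices used in the rest of this entry.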
I'm also going to avoid covering application in a specific language. Most recently, I was doing this in MEL for the project I described above, but there are Python libraries for doing matrix math, Tcl/Tk support for it, PHP and Perl support - you'll rarely find yourself with no built-in or easily-added matrix support, though you may want to write (as I did) a number of routines to make it a little more accessible.
To the left, you can see the standard format of a translation matrix. When this matrix is multiplied by a coordinate matrix (as represented on the right-hand side of the equation), it translates that coordinate into a new space, offsetting it by (tx, ty, tz). In the typical way of matrix multiplication, the element in each row of the coordinate matrix [x y z 1] is multiplied by the corresponding column of the translation matrix, as shown here:
The entries in each row of this matrix are then added together (in typical matrix multiplication fashion) to give the resulting (x', y', z') location: one row sum for each of x', y' and z'.
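The translation step can be sketched in Python as follows. This assumes the column-vector layout the description above implies (offsets in the last column, coordinate written as a column [x y z 1]); packages that use row vectors store the transpose instead. The helper names `translation_matrix` and `transform_point` are made up for this example.

```python
def translation_matrix(tx, ty, tz):
    """4x4 translation matrix with the offsets in the last column (assumed layout)."""
    return [
        [1, 0, 0, tx],
        [0, 1, 0, ty],
        [0, 0, 1, tz],
        [0, 0, 0, 1],
    ]


def transform_point(m, point):
    """Apply a 4x4 matrix to an (x, y, z) point, returning the new (x', y', z')."""
    x, y, z = point
    column = [x, y, z, 1]
    # Each coordinate element multiplies its matching column of the matrix...
    products = [[m[i][j] * column[j] for j in range(4)] for i in range(4)]
    # ...and each row of products is summed to give x', y', z' (and a trailing 1).
    sums = [sum(row) for row in products]
    return tuple(sums[:3])


moved = transform_point(translation_matrix(10, -2, 5), (1, 2, 3))
print(moved)  # (11, 0, 8)
```

The trailing 1 in the coordinate is what lets the offsets in the last column get added in.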
But there's more to matrix transformation than simple translation. If you wanted to find the new (x', y', z') for a translation like this one, simple addition would be enough. But manipulating points in 3d space is rarely that simple!
Fortunately, it's just as easy to scale a point using matrix math! The scaling is always away from the origin - we'll talk about how to scale a point away from somewhere other than the origin shortly. All matrix operations work with a similar setup, so you'll get used to seeing similar notation here.
This matrix is multiplied in the same fashion as the one we see above: rows against columns, with the row sums producing (x', y', z') for the new location.
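As a rough Python sketch of scaling, assuming the usual form of a scale matrix (scale factors on the diagonal) and the same column-vector layout as before; `scale_matrix` and `transform_point` are names invented for this illustration:

```python
def scale_matrix(sx, sy, sz):
    """4x4 scale matrix: scale factors on the diagonal, scaling about the origin."""
    return [
        [sx, 0, 0, 0],
        [0, sy, 0, 0],
        [0, 0, sz, 0],
        [0, 0, 0, 1],
    ]


def transform_point(m, point):
    """Apply a 4x4 matrix to an (x, y, z) point using row sums of the products."""
    column = list(point) + [1]
    return tuple(sum(m[i][j] * column[j] for j in range(4)) for i in range(3))


scaled = transform_point(scale_matrix(2, 3, 0.5), (1, 2, 4))
print(scaled)  # (2, 6, 2.0)
```

A point sitting exactly on the origin stays put no matter what the scale factors are, which is why the scale is always "away from the origin".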
I'll admit, though: I don't use matrix math for scaling things very often. You know what I do use it for, though?
Rotating a point in space with respect to the origin! Isn't that exciting? I LOVE rotating things! Well, ok, so it's not that big of a deal - but when we start combining some of these things, it can turn a complicated object tree in your scene into a relatively simple expression.
I have to warn you, though, rotation is a bitch. Ever notice how your favorite 3d software has this whole "rotation order" thing? That's because if you rotate something 10 degrees in X, then 15 degrees in Y, it's not the same as rotating it in Y first and then in X. There's also a different matrix for each axis of rotation, so let's take a look. Same matrix math process as translation and scaling, but from this one we get rotation around the Z axis.

And now here:

That one rotates in X! And lastly, as you'd expect, there's a matrix for rotating around the Y axis:
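For reference, all three rotation matrices can be sketched in Python like this. The signs below assume a right-handed coordinate system, counterclockwise rotation for positive angles, and the column-vector layout; other conventions flip the signs of the sine terms or transpose the matrix. The function names are hypothetical.

```python
import math


def rotation_x(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def rotation_y(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]


def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def transform_point(m, point):
    """Apply a 4x4 matrix to an (x, y, z) point."""
    column = list(point) + [1]
    return tuple(sum(m[i][j] * column[j] for j in range(4)) for i in range(3))


# Under these assumptions, 90 degrees about Z carries the X axis onto the Y axis.
about_z = transform_point(rotation_z(90), (1, 0, 0))  # close to (0, 1, 0)
about_x = transform_point(rotation_x(90), (0, 1, 0))  # close to (0, 0, 1)
about_y = transform_point(rotation_y(90), (0, 0, 1))  # close to (1, 0, 0)
print(about_z, about_x, about_y)
```

The results are only "close to" the exact values because the cosine of 90 degrees comes out as a tiny floating-point number rather than zero.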

Now, combining these matrix transformations together is as simple as multiplying the matrices with each other. You write the series of multiplications in order from right to left (the transformation you want applied first goes rightmost), even though the multiplications themselves are carried out from left to right. For instance, for "zxy" rotation order, you would create an expression similar to the following. I'll use MEL for this example; $xRot, $yRot, and $zRot are each 4x4 matrices that already contain transformation data for x, y and z rotation:
matrix $r[4][4] = $yRot * $xRot * $zRot;
Provided proper matrix variables are supplied, MEL supports basic matrix operations with some limitations that you'll find as you stretch your legs.
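To see the ordering in action outside of MEL, here is a small Python sketch under the same assumptions as the expression above (first-applied rotation rightmost, column-vector layout, right-handed rotations). The helpers `rotation`, `mat_mul` and `transform_point`, and the sample angles, are made up for this illustration.

```python
import math


def rotation(axis, degrees):
    """4x4 rotation about 'x', 'y' or 'z' (right-handed, column-vector layout)."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    if axis == "x":
        return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
    if axis == "y":
        return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]
    if axis == "z":
        return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    raise ValueError("axis must be 'x', 'y' or 'z'")


def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, point):
    """Apply a 4x4 matrix to an (x, y, z) point."""
    column = list(point) + [1]
    return tuple(sum(m[i][j] * column[j] for j in range(4)) for i in range(3))


x_rot, y_rot, z_rot = rotation("x", 10), rotation("y", 15), rotation("z", 20)
point = (1.0, 2.0, 3.0)

# "zxy" order: z is applied first, so it sits rightmost in the product.
r = mat_mul(mat_mul(y_rot, x_rot), z_rot)
combined = transform_point(r, point)

# The same thing done one rotation at a time: z, then x, then y.
step_by_step = transform_point(
    y_rot, transform_point(x_rot, transform_point(z_rot, point)))

# Swapping the order of two rotations lands the point somewhere else.
x_then_y = transform_point(mat_mul(y_rot, x_rot), point)
y_then_x = transform_point(mat_mul(x_rot, y_rot), point)
print(combined, step_by_step)
print(x_then_y, y_then_x)
```

The first pair of results should agree to within floating-point error, while the second pair should visibly differ, which is the whole reason rotation order exists as a setting.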
These can be strung together into much longer expressions generating much more complex matrices. To rotate an object about a point other than the origin, for instance, subtract the values of that coordinate from the coordinates of the object (translate it by {-cx, -cy, -cz}, where {cx, cy, cz} is the center to rotate it around), perform the rotation, then translate it back by {+cx, +cy, +cz}.
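That translate, rotate, translate-back recipe can be folded into one matrix. Here is a hedged Python sketch for rotation about the Z axis, again assuming the column-vector layout (so the first step, the move to the origin, sits rightmost in the product). All function names are hypothetical.

```python
import math


def translation_matrix(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m, point):
    """Apply a 4x4 matrix to an (x, y, z) point."""
    column = list(point) + [1]
    return tuple(sum(m[i][j] * column[j] for j in range(4)) for i in range(3))


def rotate_about_point_z(degrees, center):
    """Single matrix: move center to origin, rotate about Z, move back."""
    cx, cy, cz = center
    to_origin = translation_matrix(-cx, -cy, -cz)
    back = translation_matrix(cx, cy, cz)
    # Applied right to left: to_origin first, then the rotation, then back.
    return mat_mul(mat_mul(back, rotation_z(degrees)), to_origin)


pivot = rotate_about_point_z(90, (1, 1, 0))
result = transform_point(pivot, (2, 1, 0))
print(result)  # close to (1, 2, 0)
```

Swapping `rotation_z` for a scale matrix gives scaling about a point other than the origin in exactly the same way.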
It may take some time learning to visualize and plan out a complex matrix transform - but what makes it powerful is the ability to combine any number of transformations into a single operation. Some basic trigonometry and a well-applied matrix transform can accomplish all kinds of things!
