24 Dec 2008 @ 8:41 PM 
 

The effect of interocular distance and convergence distance on 3d perception

 

In order to really make the best use of this article, you may need to visit my actual blog if you're reading this on a mirror such as my livejournal, xanga pages or my LinkedIn, or via an RSS feed, since it contains embedded quicktime movies and such that don't always play nice across those mirrors.

You might also want to pick up a set of glasses like I'm linking over on the right, since you can't make much sense of anaglyph images if you don't have anaglyph glasses! (If you've swiped polarized glasses from a movie theater, those won't work with this.) If you already have a pair of them stashed away in a drawer somewhere, you can play along already! The cheap "throwaway" cardboard ones that sometimes come with comics or advertising materials will work, but the lens quality of real framed sunglass-style glasses will improve the experience.

Note that this article is an attempt to make a pretty technical subject as accessible as possible, so as much as possible I'll be keeping it straightforward and just noting a few things that seem important. There's much more technical information to be written about at another time, but after talking to a friend that's dealing with the same things himself on another project, and realizing that each studio will run into the same issues again and again until somebody writes this stuff down, maybe it was time to get a little of it onto the internet. Since I know the people that find this article are going to range from complete novices to experienced vfx industry professionals, the tone of the article may swing a bit. Rest assured, if it momentarily becomes too mundane or too technical, it'll roll back towards the middle in another paragraph or two!

To make something clear up front, when I refer to stereo films/cg/animation/movies/etc, I'm referring to stereography - the process of creating a 3d image in the mind of the viewer by the presentation of two or more 2d images. I tend to refer to projects as "stereo" projects vs. "3d" projects since I work in visual effects where we think of "2d" as a department and 3d as a variety of cg software package: texture and digital matte painters, roto and paint artists, compositors are 2d artists, and even on a stereo film, the "2d" department does just as much work as they would on a traditional "flat" film.

After working on an in-development feature a few months back, as well as the Hannah Montana 3D concert feature, and the upcoming "My Bloody Valentine: 3D" flick, one thing was really painfully clear: almost nothing about 3d presentation has really been entered into the record. What's been learned by the relative handful of people working on stereo movies has been kept to themselves. Each new team to confront the integration of live action with stereo cg has to invent the whole process from scratch all over again. It's like the beginning of cinema, which makes it exciting and frustrating at the same time.

If you're in the industry, the time to come to terms with stereo production is now: in five years' time, we shouldn't expect to see a lot of flat films coming out of the major studios.

So let's jump in and look at a clip now and then talk about what it shows.

 


Consistent convergence and interocular distance between elements

 

 

There are two things that we discuss when talking about stereo cameras: interocular distance and convergence. These are what make a stereo camera setup different from a traditional camera. All the same old things are there: film speed (though generally this is a CCD or CMOS sensitivity level, not an actual film speed), aperture, shutter angle (or something like it), and field of view. But interocular distance and convergence are new concepts to most people, even though 3d movies have been around for over 80 years.

The video above shows a rotating checkerboard with two geometric objects on it. The interocular distance on the shot is about human-eye normal (that is, the distance between the two "cameras" used in the CG scene is about the natural distance for human eyes). I should also note that this distance is constant in this shot. Some camera rigs, such as the Pace rig that's been quite popular in stereo film production recently, have the ability to vary the interocular distance while the shot is in progress (usually in conjunction with a focus pull). When I first heard about this being touted as a promotional point, I was terrified: often, we consider ourselves lucky in visual effects if we're able to get accurate lens info, and now this? Turns out, it's not as hard to deal with as you might think - even if you don't have metadata from the shoot that pairs up interocular distance with each frame. I'll deal with the specifics of that in another post.

The convergence in the above shot is on the cone. Convergence is the distance in front of the eye-pair at which each camera's line of sight crosses the other's. If you hold your finger about a foot in front of you and look at it, your eyes are converging on your finger. If you then shift your focus to an object beyond your finger, you will notice the image of your finger splitting into two images: this is because your eyes are no longer converging at your finger but rather at another object in the distance.
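To put a rough number on that, here's a minimal Python sketch of how far each camera has to turn inward to converge at a given distance. It assumes a symmetric rig where both cameras toe in by the same amount; the function name and the 6.5 cm eye spacing are my own illustrative choices, not values from the shots above.

```python
import math

def toe_in_angle_degrees(interocular, convergence_distance):
    """Angle each camera turns inward so both lines of sight cross
    at convergence_distance (same units as interocular)."""
    half_separation = interocular / 2.0
    return math.degrees(math.atan2(half_separation, convergence_distance))

# Assumed eye spacing of 6.5 cm, converged on a finger about a foot (30 cm) away.
near = toe_in_angle_degrees(6.5, 30.0)

# The same eyes converged on something 10 m away are almost parallel.
far = toe_in_angle_degrees(6.5, 1000.0)

print(round(near, 2), round(far, 2))
```

The closer the convergence point, the harder the cameras (or your eyes) have to cross, which is why a finger held near your face is so much more work to look at than the far wall.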

An important thing to note here: when composing a scene for stereo presentation, you should always try to set your convergence on the element in frame that you most want to draw the viewer's attention to. Failure to do this will increase complaints from your audience about headaches and eyestrain as they (usually unconsciously) try to compensate! You can feel and see the effect of this by playing the above video and trying to stare at the foreground edge of the checkerboard. The good news is that if this doesn't happen in camera, it's easy to make it happen in post. The dual plates that represent the two eyes of the camera can be panned horizontally until the desired convergence element merges. If the same element sits at the same position in both plates when they're overlaid, that's where the convergence is! Again, this is something I should cover in more detail in another article - but this info will be sufficient for many.
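The panning trick boils down to very simple arithmetic. Here's a small Python sketch, assuming you've already picked out the horizontal pixel position of a few features in each plate; the feature names and pixel values are made up for illustration, and I'm assuming only the right plate gets moved.

```python
def convergence_pan(left_x, right_x):
    """Horizontal offset (pixels) to add to the right plate so the
    chosen feature lands at the same x position as in the left plate."""
    return left_x - right_x

# Hypothetical x positions of three features in each plate.
left_plate = {"cone": 512.0, "cube": 300.0, "board_edge": 640.0}
right_plate = {"cone": 498.0, "cube": 292.0, "board_edge": 610.0}

offset = convergence_pan(left_plate["cone"], right_plate["cone"])
panned_right = {name: x + offset for name, x in right_plate.items()}

# What's left over after the pan: zero on the cone, non-zero elsewhere.
remaining_disparity = {name: left_plate[name] - panned_right[name] for name in left_plate}
print(offset, remaining_disparity)
```

After the pan, the cone has no horizontal difference between the two plates, so that's where the convergence now sits. Everything else keeps some leftover offset, which is what reads as depth in front of or behind that point.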

In the above shot, all of the elements were rendered with the same interocular distance and convergence distance. They appear solid, the geometric objects appear to rest on the checkerboard, and all is right with the world.

Let's see what happens when we try to fool Mother Nature:


Inconsistent convergence and interocular distance between elements

 

This is the same render of the checkerboard, but in the render of the elements that are sitting on it (which still exist in the same location as before) the camera pair converges just a little bit in front of where it did before. The effect of this is to make them appear farther away from the viewer. It's a subtle effect: our brain still wants to perceive the elements (which obscure the table and thus *must* be in front of it) as being closer to us, but it's fighting with itself: there's a sense that the elements are somewhat indented into the table. The cone is also harder to focus on because it's no longer the point of convergence.

More to come in a later entry...

Categories: Uncategorized
Posted By: Eddie
 18 Sep 2008 @ 7:09 AM 
 

Transform matrices: using matrix multiplication for transforming coordinates in 3d space

 

As I'd promised earlier, I'll be doing a quick run-through of how matrix multiplication can be used to rotate, scale and translate coordinates through 3d space. First, though, I'm going to write a little bit about a tool I've been working on and just launched. Sorry: the actual code for it is proprietary, so I can't share it, but a brief discussion of the tool and what it needed to be able to do will help illustrate why matrix transformation is so important, even when you're not creating a 3d package from scratch or developing a game engine.

I needed a way to quickly create a usable "right eye" camera to correspond to a "left eye" camera. I needed to be able to specify interocular distance (the distance between the left and right eye), as well as the convergence distance (where the eyes were focusing). Our eyes naturally "converge" on objects that are in front of us - this is part of how we judge relative depth, as this convergence affects the apparent relationship of the dual images our eyes perceive of *other* objects that are not at the point of convergence. But that's a subject for another entry.

Anyway, the tool needed to be able to take a tracked "left eye" camera and apply that tracking data to generate the "right eye" camera using convergence and interocular information that was either derived from a track, noted on set, or dynamically assigned in Maya. I wanted a right eye camera that could be created instantly and that would automatically follow any changes I made to the left eye camera. If I smoothed that camera's path, I needed the right eye to follow suit. If I performed some dramatic transformation to that camera's path, perhaps extending its animation or re-animating an object that was part of the track while conforming the camera to that object's new path, I needed the right eye camera to keep up. If a cg element was being added, moving toward the camera, I needed to be able to selectively lock the convergence to "look at" that object's depth without diverting the cameras to actually look directly at it - similar to changing the focus depth.

It's all about control and instantaneous response, with a camera whose path is known with some degree of certainty and a second camera that will follow that one in a way that will produce realistic cg and will enable artists to quickly and accurately duplicate the second camera position when the original interocular distance and convergence information is not known.
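To be clear, this is not the proprietary tool. It's only a bare-bones Python sketch of the underlying idea: if the right eye is computed from the left eye, any change to the left camera's path carries over for free. The function name, the sample track and the 0.065 interocular value are all assumptions for illustration, and the sketch handles only the position offset, not the convergence rotation.

```python
def right_eye_position(left_position, left_right_axis, interocular):
    """Offset the left camera along its own local 'right' axis
    (a unit vector in world space) by the interocular distance."""
    return tuple(p + interocular * a for p, a in zip(left_position, left_right_axis))

# Hypothetical tracked left-eye camera: (world position, unit right axis) per frame.
left_track = [
    ((0.0, 1.5, 10.0), (1.0, 0.0, 0.0)),
    ((0.5, 1.5, 9.0), (1.0, 0.0, 0.0)),
    ((1.0, 1.5, 8.0), (0.0, 0.0, -1.0)),  # the camera has yawed 90 degrees here
]

interocular = 0.065  # assumed scene units (roughly metres)

# Recompute this whenever the left track changes and the right eye keeps up.
right_track = [right_eye_position(pos, axis, interocular) for pos, axis in left_track]
print(right_track)
```

Because the offset follows the camera's own right axis rather than world X, the right eye stays beside the left eye even after the camera turns.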

Matrix multiplication, as a mathematical operation, isn't complicated - it's simple multiplication and addition, just repeated a number of times to generate the required result matrix. Since there are already more-than-ample tutorials online about how to actually *perform* matrix math (if you were cursed to do it by hand or develop procedures to support it in programming environments that don't already have matrix math support) I won't be covering that in detail. If my simple explanation doesn't quite do it for you, put "matrix multiplication" into Google if you're stumped.

I'm also going to avoid covering application in a specific language. Most recently, I was doing this in MEL for a project I'll discuss in a moment, but there are Python libraries for doing matrix math, TCL/tk support for it, PHP and Perl support - you'll rarely find yourself with no built-in or easily-added matrix support, though you may want to write (as I did) a number of routines to make it a little more accessible.

[Figure: 3D translation matrix]

Above, you can see the standard format of a translation matrix. This matrix (when multiplied by a coordinate matrix as represented on the right-hand side of the equation) will translate that coordinate into a new space, offsetting it by (tx, ty, tz). In the typical way of matrix multiplication, the element in each row of the coordinate matrix [x y z 1] is multiplied by a column of the translation matrix, as shown here:

[Figure: translation matrix explained]

The rows of this matrix are then added together (in typical matrix multiplication fashion) to give the resulting (x',y',z') location.
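Here's a small Python sketch of that multiply-then-sum process, written out the long way so both steps are visible. It assumes the column-vector layout the description implies, with (tx, ty, tz) in the last column of the matrix; the function names are mine.

```python
def translation_matrix(tx, ty, tz):
    """4x4 translation matrix with the offsets in the last column."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_point(matrix, point):
    """Multiply a 4x4 matrix by the coordinate (x, y, z, 1)."""
    column = [point[0], point[1], point[2], 1.0]
    # Multiply each column of the matrix by the matching coordinate...
    products = [[matrix[r][c] * column[c] for c in range(4)] for r in range(4)]
    # ...then add up each row to get x', y', z' (the fourth row sums to 1).
    sums = [sum(row) for row in products]
    return tuple(sums[:3])

moved = transform_point(translation_matrix(10.0, 0.0, -2.0), (1.0, 2.0, 3.0))
print(moved)
```

The trailing 1 in the coordinate is what lets the offsets in the last column get added in; without it, a matrix multiply could never move a point sitting at the origin.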

But there's more to matrix transformation than simple translation. If you wanted to find the new (x', y', z') for a translation like this one, simple addition would be enough. But manipulating points in 3d space is rarely that simple!

Fortunately, it's just as easy to scale a point using matrix math! Scaling always happens relative to the origin - we'll talk about how to work relative to some point other than the origin shortly. All matrix operations work with a similar setup, so you'll get used to seeing a similar notation here.

[Figure: scaling matrix]

This matrix is multiplied in the same fashion as the one we see above, rows against columns, with row sums producing (x',y', z') for the new location.
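A quick Python sketch of scaling, assuming the usual form with the scale factors (sx, sy, sz) down the diagonal and the same column-vector layout as the translation case. The function names are mine.

```python
def scale_matrix(sx, sy, sz):
    """4x4 scale matrix: the scale factors sit on the diagonal."""
    return [
        [sx, 0.0, 0.0, 0.0],
        [0.0, sy, 0.0, 0.0],
        [0.0, 0.0, sz, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_point(matrix, point):
    """Multiply a 4x4 matrix by the coordinate (x, y, z, 1)."""
    column = (point[0], point[1], point[2], 1.0)
    return tuple(sum(matrix[r][c] * column[c] for c in range(4)) for r in range(3))

scaled = transform_point(scale_matrix(2.0, 2.0, 0.5), (1.0, -3.0, 4.0))
origin = transform_point(scale_matrix(2.0, 2.0, 0.5), (0.0, 0.0, 0.0))
print(scaled, origin)
```

The origin doesn't move no matter what the scale factors are, which is what "relative to the origin" means in practice.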

I'll admit, though. I don't use matrix math for scaling things very often. You know what I do use it for, though?

Rotating a point in space with respect to the origin! Isn't that exciting? I LOVE rotating things! Well, ok, so it's not that big of a deal - but when we start combining some of these things, it can turn a complicated object tree in your scene into a relatively simple expression.

I have to warn you, though, rotation is a bitch. Ever notice how your favorite 3d software has this whole "rotation order" thing? That's because if you rotate something 10 degrees in X, then 15 degrees in Y, it's not the same as rotating it in Y first and then in X. There's also a different matrix for each axis of rotation, so let's take a look. Same matrix math process as translation and scaling, but from this one we get rotation around the Z axis.

Matrix for z-Rotation

And now here:

Matrix for x-rotation

That one rotates in X! And lastly, as you'd expect, there's a matrix for rotating around the Y axis:

 

Matrix for y-rotation
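For reference, here are the three rotation matrices as a Python sketch. I'm assuming the standard right-handed forms for column-vector coordinates; if your package uses the transposed (row-vector) layout, the signs on the sine terms swap. The function names are mine.

```python
import math

def rotate_x(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rotate_y(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def rotate_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform_point(matrix, point):
    """Multiply a 4x4 matrix by the coordinate (x, y, z, 1)."""
    column = (point[0], point[1], point[2], 1.0)
    return tuple(sum(matrix[r][c] * column[c] for c in range(4)) for r in range(3))

# A quarter turn about each axis swings one axis onto the next.
x_to_y = transform_point(rotate_z(90.0), (1.0, 0.0, 0.0))
y_to_z = transform_point(rotate_x(90.0), (0.0, 1.0, 0.0))
z_to_x = transform_point(rotate_y(90.0), (0.0, 0.0, 1.0))
print(x_to_y, y_to_z, z_to_x)
```

The results come out with tiny floating-point leftovers where you'd expect exact zeros, so compare with a tolerance rather than with equality.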

Combining these matrix transformations is as simple as multiplying the matrices with each other. You write the series of multiplications in order from right to left (the first rotation to apply goes on the far right), even though the multiplication itself is carried out from left to right. For instance, for "zxy" rotation order, you would create an expression similar to the following. (I'll use MEL for this example: $xRot, $yRot, and $zRot are each 4x4 matrices that already contain transformation data for x, y and z rotation.)

matrix $r[4][4] = $yRot * $xRot * $zRot;

Provided proper matrix variables are supplied, MEL supports basic matrix operations with some limitations that you'll find as you stretch your legs.
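If you'd like to convince yourself about the ordering, here's a Python sketch that mirrors the MEL expression. It assumes column-vector coordinates and standard right-handed rotation matrices; the helper names and the test angles are mine.

```python
import math

def rotation(axis, degrees):
    """4x4 rotation about 'x', 'y' or 'z' (right-handed, column-vector layout)."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    if axis == "x":
        return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
    if axis == "y":
        return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]
    if axis == "z":
        return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    raise ValueError("axis must be 'x', 'y' or 'z'")

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def transform_point(matrix, point):
    column = (point[0], point[1], point[2], 1.0)
    return tuple(sum(matrix[r][c] * column[c] for c in range(4)) for r in range(3))

x_rot, y_rot, z_rot = rotation("x", 10.0), rotation("y", 15.0), rotation("z", 20.0)
point = (1.0, 2.0, 3.0)

# Same shape as the MEL expression: $yRot * $xRot * $zRot.
combined = mat_mul(mat_mul(y_rot, x_rot), z_rot)
one_step = transform_point(combined, point)

# Applying z, then x, then y one at a time lands on the same spot.
step_by_step = transform_point(y_rot, transform_point(x_rot, transform_point(z_rot, point)))

# Swapping the order of two rotations gives a different result.
x_then_y = transform_point(mat_mul(y_rot, x_rot), point)
y_then_x = transform_point(mat_mul(x_rot, y_rot), point)
print(one_step, step_by_step, x_then_y, y_then_x)
```

The payoff of the combined matrix is that you build it once and then push as many points through it as you like with a single multiply each.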

These can be strung together into much longer expressions generating much more complex matrices. To rotate an object about a point other than the origin, for instance, subtract the values of that coordinate from the coordinates of the object (translate it by {-cx, -cy, -cz}, where {cx, cy, cz} is the center to rotate it around), perform the rotation, then translate it back by {+cx, +cy, +cz}.
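Here's that translate, rotate, translate-back sandwich as a Python sketch, limited to rotation about Z to keep it short. It assumes column-vector coordinates, so the matrix applied first sits on the far right of the product; the function names are mine.

```python
import math

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def transform_point(matrix, point):
    column = (point[0], point[1], point[2], 1.0)
    return tuple(sum(matrix[r][c] * column[c] for c in range(4)) for r in range(3))

def rotate_about_pivot_z(degrees, pivot):
    """One matrix that rotates about Z around 'pivot' instead of the origin."""
    cx, cy, cz = pivot
    to_origin = translation(-cx, -cy, -cz)
    back = translation(cx, cy, cz)
    # Rightmost is applied first: move to the origin, rotate, move back.
    return mat_mul(mat_mul(back, rotation_z(degrees)), to_origin)

pivot = (1.0, 1.0, 0.0)
matrix = rotate_about_pivot_z(90.0, pivot)
moved = transform_point(matrix, (2.0, 1.0, 0.0))
fixed = transform_point(matrix, pivot)
print(moved, fixed)
```

The point one unit to the right of the pivot ends up one unit above it, and the pivot itself stays put, which is a handy sanity check for any "rotate about a point" matrix.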

It may take some time learning to visualize and plan out a complex matrix transform - but what makes it powerful is the ability to combine any number of transformations into a single operation. Some basic trigonometry and a well-applied matrix transform can accomplish all kinds of things!

Categories: geektalk, visual effects, visual effects pipeline
 16 May 2008 @ 12:49 AM 
 

XML, Python and the Visual Effects Pipeline

 

I was talking to a friend today about what I'm doing with regards to managing data through an animation pipeline using XML. The more I work with it and the farther I get into the project, the more flexible and powerful the whole thing seems. Of course, the point of doing the implementation in Python is that virtually every software package in the vfx industry is Python-friendly - so once the core routines are written, everything from Nuke and pyShake (the Python plugin for Shake - if you haven't seen it yet, check it out here) to Maya, Houdini and RealFlow will be able to make use of them. I think most places are doing that these days, with a few nods to TCL/tk here and there - but broadly supported scripting languages are King and open description formats like XML are Queen.

My friend marveled at how nice it would be if one day, a couple years from now, everything was able to talk that smoothly: that a character animated in Maya could be pulled into Houdini, for instance, as something other than an OBJ sequence or a separately rigged character that you had to tediously (or with a lot of specific coding) link to exported channel data.

I wonder if that interoperability thing will ever extend beyond each individual studio's implementation. Everybody has a way of getting their software packages to talk to each other, some solutions being more elegant than others, but when you invest in creating something as elaborate as this it becomes your own proprietary tool. Say you develop a tool that lets an animator take an animated character with a complex rig on it, arbitrarily select additional elements that were never *really* meant to be animated and animate them anyway; that lets the modeling team modify the model and issue a new version of it, with the animation seamlessly transferred over to the new model; and that can even be read into RealFlow, substituting a different set of low poly independent objects that are driven by the data in that XML file. You don't put that pipeline tool on the internet for everyone to download for free.

That tool becomes your secret weapon. As a studio with an investment in a powerful and unique proprietary tool, even charging for it may not mean as much to you as the edge you gain during the heat of production.

Being XML based and implemented in Python does put my current project a wee bit closer to being an open standard, though. Even Shake will take Python scripts now - and they're really powerful in it and getting more so as development continues. The readability thing for XML is a gigantic plus, and the way it represents data is great. I can build a module that will write out the translation of a locator in both world and local space, as a baked set (every frame has a value) and as a set of keyframes (values only for those frames where the value was explicitly set by the artist), as well as screenspace UV values - so the same XML file could reconstruct a scene for a lighter to light and render from or another animator to tweak the animation curves, or for RealFlow to drive low-poly proxy objects with to disturb a drifting mist, or for a compositor in Toxic to link an effect to. And it's all one XML file - not a half dozen formats (often multiple versions of each) and a hundred-unit sequence of geometry exports.
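To give a flavor of what that might look like, here's a minimal Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (locator, translate, sample and so on) are a made-up schema for illustration, not the format my project actually uses.

```python
import xml.etree.ElementTree as ET

def locator_to_xml(name, baked, keyed):
    """baked / keyed: dicts mapping frame number -> (tx, ty, tz) in world space."""
    locator = ET.Element("locator", name=name)
    for kind, samples in (("baked", baked), ("keyframes", keyed)):
        channel = ET.SubElement(locator, "translate", space="world", type=kind)
        for frame in sorted(samples):
            tx, ty, tz = samples[frame]
            ET.SubElement(channel, "sample", frame=str(frame),
                          x=str(tx), y=str(ty), z=str(tz))
    return locator

# Keyframes only where the artist set a value; the baked set has every frame.
keyed = {1: (0.0, 0.0, 0.0), 3: (2.0, 0.0, 0.0)}
baked = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (2.0, 0.0, 0.0)}

xml_text = ET.tostring(locator_to_xml("mist_driver", baked, keyed), encoding="unicode")

# Any Python-capable package can read the same text back and pick what it needs.
parsed = ET.fromstring(xml_text)
keyed_frames = [int(s.get("frame")) for s in parsed.find("translate[@type='keyframes']")]
baked_frames = [int(s.get("frame")) for s in parsed.find("translate[@type='baked']")]
print(keyed_frames, baked_frames)
```

An animator's tool would read the keyframes block to rebuild editable curves, while something like a fluid sim would grab the baked block and never care how the motion was authored.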

Categories: geektalk, python, visual effects, visual effects pipeline, xml