'Inspired 3D': Compositing Techniques and Methods — Part 1

From the Inspired 3D series, David Parrish tackles compositing techniques and methods.

All images from Inspired 3D Lighting and Compositing by David Parrish, series edited by Kyle Clark and Michael Ford. Reprinted with permission.

This is the fourth in a number of adaptations from the new Inspired series published by Premier Press. Comprised of four titles and edited by Kyle Clark and Michael Ford, these books are designed to provide animators and curious moviegoers with tips and tricks from Hollywood veterans. The following is excerpted from Lighting and Compositing.

Basic Compositing Operations

Compositing at its most basic level is the combination of images. In this book, those images are referred to as layers. The way those layers are combined is important in creating the final look of the image. Compositing software packages have many functions for combining layers, but a solid understanding of the basic ones goes a long way in helping the compositor make the most of the input images. The following examples provide a brief explanation along with a visual representation of the most common layering processes found in compositing software packages. (The mathematics behind each function is described in the Appendix.) For the purposes of consistency, each example in this section utilizes the same simple, computer graphics elements. The elements are a building block, which will be the A input for each example (see Figure 19), and a beach ball, which will be the B input (see Figure 20).

[Figures 19 & 20] A building block element (left) and a beach ball element (right).

Over and Under

To begin with, examine one of the most commonly used functions in any compositor's toolbox: the over function. The most basic compositing script can be described as A over B. In the simplest terms, this means putting one element (called A) over another element (called B). The key to understanding how the over function works lies in the alpha channel and the concept of pre-multiplied images. The alpha channel defines the portion of the image that is maintained during the over operation. The black portion of the alpha is what gets thrown away, whereas the white portion is what gets composited. This is true, however, only if the image is pre-multiplied, which means that the color channel values have been multiplied by the alpha channel values. Where the alpha is black (a value of zero), this multiplication yields a result of zero. Where the alpha is white (a value of one), this multiplication yields the exact same color value seen in the red, green, and blue channels. Some software packages automatically pre-multiply an image when the over function is applied, and others do not. Without pre-multiplied images, the over function will not work as expected, so it is vital to determine whether the software performs this function automatically.

The next step in the over process is taking the alpha channel of the A element, inverting it, and multiplying it with the B element, or the lower layer of this composite. This multiplication creates a hole in the B element, in which the A element will fit perfectly. With the hole cut out of the B-side of the composite, the A and B images can now be added together. The resulting image is the A element over the B element (see Figure 21). Fortunately, the software does most of the work, and the compositor simply specifies the A and B sides for the over function. The primary task for a compositor when utilizing the over function is dealing with the edges of the objects being composited. Where the alpha channel is pure white or pure black, the result is cut and dried. At the edges, where the alpha transitions between black and white, the compositor's job becomes more complex. As the alpha values transition, the pre-multiplication process modifies the values in the red, green and blue channels, adding a fraction of the foreground color to the inverse fraction of the background color. If the alpha's edge transition is not correctly positioned or valued, the effect is a noticeable brightening or darkening of the color values at the edge of an object. Attention to edges is critical to creating good composites.
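
The steps just described can be sketched for a single pixel. This is a minimal illustration rather than the formula from the book's Appendix, which is not reproduced in this excerpt: it assumes normalized RGBA values between 0 and 1, and the names premultiply and over are hypothetical helpers, not the API of any particular compositing package.

```python
def premultiply(pixel):
    """Multiply each color channel by alpha; alpha itself is unchanged."""
    r, g, b, a = pixel
    return (r * a, g * a, b * a, a)


def over(a_px, b_px):
    """A over B for premultiplied RGBA pixels with channels in 0..1."""
    inv_alpha = 1.0 - a_px[3]  # inverted A alpha cuts the hole in B
    return tuple(a + b * inv_alpha for a, b in zip(a_px, b_px))


block = premultiply((1.0, 0.5, 0.0, 1.0))  # opaque orange pixel of the A element
ball = premultiply((0.0, 0.0, 1.0, 1.0))   # opaque blue pixel of the B element
edge = premultiply((1.0, 0.5, 0.0, 0.5))   # A pixel on a soft edge, alpha 50%

print(over(block, ball))  # (1.0, 0.5, 0.0, 1.0): solid A hides B
print(over(edge, ball))   # (0.5, 0.25, 0.5, 1.0): half A plus half B
```

The edge pixel shows why edges need attention: its color is a blend whose proportions come entirely from the alpha value, so an alpha that is too high or too low shifts the blend and brightens or darkens the edge.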

The sister function to the over function is the under function. It is, in essence, the same function performed in reverse. Instead of placing the A-side over the B-side, this function does just the opposite, placing the A-side under the B-side (see Figure 22). To achieve the same result, the inputs to the over command can be switched, making it a B over A comp (yielding the exact same results as an A under B comp). Many compositing functions have a counterpart to perform the complementary mathematical calculations. For each operation listed here, the most commonly used and referred to operation is explained first, and the less common operation is described afterward.
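
The equivalence between A under B and B over A can be written down directly. The sketch below makes the same assumptions as any per-pixel illustration of over (premultiplied RGBA values between 0 and 1), and the function names are hypothetical.

```python
def over(a_px, b_px):
    """A over B for premultiplied RGBA pixels with channels in 0..1."""
    inv_alpha = 1.0 - a_px[3]
    return tuple(a + b * inv_alpha for a, b in zip(a_px, b_px))


def under(a_px, b_px):
    """A under B is simply B over A with the inputs switched."""
    return over(b_px, a_px)


block = (0.5, 0.25, 0.0, 0.5)  # premultiplied, half-transparent A pixel
ball = (0.0, 0.0, 1.0, 1.0)    # premultiplied, opaque B pixel

print(under(block, ball))  # (0.0, 0.0, 1.0, 1.0): opaque B hides A
```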

[Figures 21 & 22] An over function, with the block as the A-side and the beach ball as the B-side (left). An under function, with the block as the A-side and the beach ball as the B-side (right).

[Figure 23] An add function, with the block as the A-side and the beach ball as the B-side.

Add and Subtract

The add function is a bit simpler than the over function. With the add function, each channel value from the A side is simply added to the B-side (see Figure 23). For areas in which the values added together exceed the maximum allowable value (pure white), the value is clamped to the highest value. For this reason, bright images that are added together often yield a predominantly white image. All of the color channels (red, green, and blue), as well as the alpha channel from the A side, are added to the corresponding channel of the B-side. Referring to the example shown in Figure 23, because the alpha is either black or white for each of the two images, the addition of the alpha channels produces a simple result. Black, or zero, added to black is black, and because the values are clamped at the maximum, white added to white yields white. As with the over function, there can be problems at the edges where alpha values transition from black to white. Adding two images where the alpha channels are each at 50% will produce a completely white alpha. A 50% alpha basically only allows 50 percent of the color values to come through in compositing functions. When two 50% alpha channels are added together, 100% of the color channels come through and can create unwanted bright edges around elements added together. When using an add function, it is often necessary to replace the alpha channel of the resulting image, for example with the alpha from one of the original input images.

The add function is useful for adding extra lighting effects to CG characters. Rendering a separate specular pass for a particular light provides an element that can be added into the original character. Because the alpha channels between the fully rendered character and the specular-only pass should match, the alpha can simply be replaced with the alpha from the original input image after the add function is performed. This is a powerful tool for adding detail elements and maintaining separate control of those elements within a comp script.
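
The clamping behavior and the alpha replacement described above can be sketched per pixel. This is an illustration under stated assumptions: channels are normalized to the range 0 to 1, and add, clamp01 and replace_alpha are hypothetical helper names rather than functions from a specific package.

```python
def clamp01(value):
    """Clamp a channel value to the legal 0..1 range."""
    return max(0.0, min(1.0, value))


def add(a_px, b_px):
    """Add every channel of A (including alpha) to B, clamping at white."""
    return tuple(clamp01(a + b) for a, b in zip(a_px, b_px))


def replace_alpha(pixel, alpha_source):
    """Keep the color of pixel but take the alpha from alpha_source."""
    return pixel[:3] + (alpha_source[3],)


character = (0.25, 0.25, 0.25, 0.5)  # edge pixel of the full render
specular = (0.5, 0.5, 0.5, 0.5)      # same pixel in the specular-only pass

summed = add(character, specular)
print(summed)                            # (0.75, 0.75, 0.75, 1.0): alpha is now white
print(replace_alpha(summed, character))  # (0.75, 0.75, 0.75, 0.5): alpha restored
```

Restoring the alpha from the original render keeps the added highlight while preserving the edge transparency that later compositing steps depend on.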

[Figures 24 & 25] A subtract function with the beach ball being subtracted from the block (left). A subtract function with the block being subtracted from the beach ball (right).

[Figure 26] The alpha channel resulting from a subtract function.

The opposite of an add function is a subtract function. This function takes each value from the A input and subtracts from it the value in the same channel of the B input. This function can create visually unusual results (see Figure 24), and I have rarely found a use for it in a production environment. Notice the areas in which the B image extends beyond the boundaries of the A image; the subtract function completely eliminates the values of the B-side (because the values outside of the cube are zero, and subtracting from them gets clamped to zero). It is also interesting to note that the brightest areas of the ball, where it was white, now become the darkest areas of the resulting image. If the A and B inputs are reversed and the beach ball becomes the A side with the cube becoming the B-side, the result is quite different (see Figure 25). The beach ball now forms the boundaries of the resulting image with a portion of the block appearing within it. If you thought the add function created unusual alpha channels, wait until you see what the subtract function does. Because the alpha values are subtracted, the resulting alpha is rarely useful for compositing the resulting image later in the comp script (see Figure 26). The confusing part is that the alpha channel resulting from a subtract operation does not match up with the color channels. In most compositing operations, the alpha channel mimics the outline of the color channels. Because the subtract function negates that correlation, utilizing an alpha channel resulting from the subtract function can be confusing.
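
A per-pixel sketch makes the mismatch between color and alpha easier to see. It assumes normalized RGBA values and clamping at zero, as described above; the function name subtract is a hypothetical stand-in for whatever a given package calls the operation.

```python
def subtract(a_px, b_px):
    """Subtract each B channel from the matching A channel, clamping at black."""
    return tuple(max(0.0, a - b) for a, b in zip(a_px, b_px))


block = (0.75, 0.5, 0.25, 1.0)      # A pixel inside the block
ball_dark = (0.25, 0.25, 0.25, 1.0) # darker B pixel overlapping the block
ball_white = (1.0, 1.0, 1.0, 1.0)   # white B pixel overlapping the block
empty = (0.0, 0.0, 0.0, 0.0)        # A pixel outside the block

print(subtract(block, ball_dark))   # (0.5, 0.25, 0.0, 0.0): color survives, alpha is gone
print(subtract(block, ball_white))  # (0.0, 0.0, 0.0, 0.0): white B leaves black
print(subtract(empty, ball_dark))   # (0.0, 0.0, 0.0, 0.0): B outside A is eliminated
```

In the first result the pixel still carries color but its alpha has dropped to zero, which is the disconnect between alpha and color channels that makes the resulting matte hard to use.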

[Figures 27 & 28] An inside function (left) with the block being placed inside the beach ball. An outside function (right) with the block being placed outside the beach ball.

Inside and Outside

The inside function takes the input from the A side and places it inside of the B-side's alpha channel. The color channels from the B-side of the function are not used at all and have absolutely no effect on the resulting image (see Figure 27). Assuming that each shape has a solid alpha with the same shape as its color channels, the resulting image will have an alpha channel matching the resulting shape of the building block's color channels. If the A input to the inside function has no alpha, the inside function will still produce the same results in the color channels but will have a blank alpha channel. This is because the actual operation is a multiplication of all the A-side channels by the B-side alpha. The inside function is often shortened and referred to as an in function.

The reverse of an inside function is an outside function. The outside function takes the inverse of the alpha channel from the B-side of the function to multiply with the A-side (see Figure 28). As with the inside function, the outside function does not use the color channels from the B input at all. Only the alpha channel from the B input is utilized. The outside function is also commonly shortened, and referred to simply as an out.
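
Both operations reduce to a single multiplication per channel, which the following sketch illustrates. It assumes normalized RGBA pixels; inside and outside are hypothetical function names chosen to match the text.

```python
def inside(a_px, b_px):
    """Multiply every A channel by B's alpha; B's color is ignored."""
    return tuple(a * b_px[3] for a in a_px)


def outside(a_px, b_px):
    """Multiply every A channel by the inverse of B's alpha."""
    return tuple(a * (1.0 - b_px[3]) for a in a_px)


block = (0.5, 0.25, 1.0, 1.0)   # A pixel
ball = (0.9, 0.1, 0.3, 0.25)    # B pixel with 25% alpha; color is arbitrary

print(inside(block, ball))   # (0.125, 0.0625, 0.25, 0.25)
print(outside(block, ball))  # (0.375, 0.1875, 0.75, 0.75)
```

Because the two multipliers sum to one, the inside and outside results for the same pair of inputs add back up to the original A pixel.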

[Figures 29 & 30] A mix function (left) with 75% of the cube and 25% of the beach ball. A multiply function (right) with the block being multiplied by the beach ball.

Mix

The mix function combines two images together based on a percentage input by the compositor. The percentage represents the amount of the A input that will be included in the resulting image. The percentage remaining to reach 100% is the amount of the B input that will be mixed into the output image. For example, if the input number for the mix function is 75%, then the resulting image will include 75% of the A image and 25% of the B image (see Figure 29). This function applies the percentage to the color channels as well as the alpha.
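
The 75% example works out as follows for one pixel. This sketch assumes normalized RGBA values and expresses the percentage as a fraction between 0 and 1; the name mix and its amount parameter are illustrative, not taken from a particular package.

```python
def mix(a_px, b_px, amount):
    """Blend amount of A with (1 - amount) of B across all four channels."""
    return tuple(a * amount + b * (1.0 - amount) for a, b in zip(a_px, b_px))


block = (1.0, 0.0, 0.5, 1.0)  # A pixel
ball = (0.0, 1.0, 0.5, 0.0)   # B pixel

print(mix(block, ball, 0.75))  # (0.75, 0.25, 0.5, 0.75)
```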

Multiply and Divide

The multiply function multiplies all of the channels in the A input image by their respective channels in the B input image. Because values are multiplied together, any pixel location with a zero value in either image results in a zero pixel value in the output image. This means that the resulting image can only have values where the two input images overlap (see Figure 30). Any area in the A image multiplied with almost any corresponding area in the B image will be darkened by the multiplication. This is true because the values in each channel are normalized (placed in a range from 0 to 1), meaning two fractional values are usually being multiplied together. A value in one image is left undarkened only where the corresponding value in the other image is pure white (1), and the output is pure white only where both images are pure white, giving them each values of 1 and producing an output value that is also 1. Multiplication of color channels produces results that are not intuitive to how the eyes interpret color. For this reason, I rarely use the multiply function in normal compositing tasks.
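
A short per-pixel sketch shows the darkening. It assumes normalized RGBA values, and multiply is a hypothetical function name.

```python
def multiply(a_px, b_px):
    """Multiply each A channel by the matching B channel."""
    return tuple(a * b for a, b in zip(a_px, b_px))


block = (0.5, 0.5, 1.0, 1.0)  # A pixel
ball = (0.5, 1.0, 1.0, 1.0)   # B pixel
empty = (0.0, 0.0, 0.0, 0.0)  # B pixel outside the ball

print(multiply(block, ball))   # (0.25, 0.5, 1.0, 1.0): only the red channel darkens
print(multiply(block, empty))  # (0.0, 0.0, 0.0, 0.0): nothing survives outside the overlap
```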

[Figures 31 & 32] A max function (left) with the block and the beach ball as the inputs. A min function (right), with the block and the beach ball as the inputs.

[Figure 33]

The opposite of the multiply function is the divide function. The divide function produces an image with value only where the A input image has values. The B input will appear only in the areas in which it overlaps the A input (see Figure 31). The areas in which the A image exists outside of the B image (the bottom-right corner of the block in Figure 31) appear as an unchanged version of the A input image. This area has values in the channel for the A input but no values for the B input. This produces the illegal operation of division by zero, so the compositing software simply makes the choice of preserving the existing value for the output image.
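
The division-by-zero handling can be sketched per channel. Preserving the A value when B is zero follows the description above; clamping the quotient at 1.0 is an assumption borrowed from the add function, since the excerpt does not say how out-of-range results are handled. The function name divide is hypothetical.

```python
def divide(a_px, b_px):
    """Divide each A channel by the matching B channel.

    Where B is zero the A value is kept unchanged. Quotients above 1.0
    are clamped to white, which is an assumption of this sketch.
    """
    result = []
    for a, b in zip(a_px, b_px):
        if b == 0.0:
            result.append(a)  # illegal division: preserve the existing value
        else:
            result.append(min(1.0, a / b))
    return tuple(result)


block = (0.25, 0.5, 0.0, 1.0)  # A pixel
ball = (0.5, 0.5, 0.5, 1.0)    # B pixel overlapping the block
empty = (0.0, 0.0, 0.0, 0.0)   # B pixel outside the ball

print(divide(block, ball))   # (0.5, 1.0, 0.0, 1.0)
print(divide(block, empty))  # (0.25, 0.5, 0.0, 1.0): A passes through unchanged
```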

Max and Min

The max function takes the maximum value from the two inputs and assigns that value to the output image. The brightest portion of each image is used in the resulting image (see Figure 32). Unlike many of the previous examples, the order of application does not matter with this function. Switching the A and B inputs will yield the exact same result. Given two images with solid alpha channels defining their shapes, the resulting alpha channel of a max function will be the combination of the two input image shapes.

The inverse of the max function is the min function. Because this takes the minimum values of each input, the resulting image will only have values where the two input images overlap. In those areas, each pixel value will be the lower of the two inputs (see Figure 33). The alpha channel will only exist in the overlap as well, because in all other areas, one of the alpha channels is 0.
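
Both functions compare the inputs channel by channel, as in this sketch. It assumes normalized RGBA pixels; the names max_op and min_op are hypothetical and chosen only to avoid shadowing Python's built-in max and min.

```python
def max_op(a_px, b_px):
    """Keep the larger value in each channel."""
    return tuple(max(a, b) for a, b in zip(a_px, b_px))


def min_op(a_px, b_px):
    """Keep the smaller value in each channel."""
    return tuple(min(a, b) for a, b in zip(a_px, b_px))


block = (0.5, 0.25, 0.75, 1.0)  # A pixel
ball = (0.25, 0.75, 0.5, 1.0)   # B pixel overlapping the block
empty = (0.0, 0.0, 0.0, 0.0)    # B pixel outside the ball

print(max_op(block, ball))   # (0.5, 0.75, 0.75, 1.0)
print(min_op(block, ball))   # (0.25, 0.25, 0.5, 1.0)
print(max_op(block, empty))  # (0.5, 0.25, 0.75, 1.0): A survives outside the overlap
print(min_op(block, empty))  # (0.0, 0.0, 0.0, 0.0): nothing survives outside the overlap
```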

These are the more common compositing operators found in software packages. During everyday compositing, the over function will likely be used more than all of the other operators combined. The over function is a powerful tool and is definitely worth studying in depth. Understanding the mathematics behind the operation, as described in the Appendix, provides the compositor with an excellent point of reference for gaining control over the process of layering images.

Depth Cues and Contrast

As mentioned in Chapter 5, Tools of the Trade, depth cues are a primary method for clarifying the three-dimensionality of a scene within a two-dimensional image. Depth cues provide the viewer with information to indicate which elements in the scene are closer and which are farther from the camera. These cues can be the occlusion of one object by another, relative sizes of objects, reduced clarity, blurring, haze and atmosphere, to name a few. All of these cues can be introduced or enhanced during the compositing stage of a shot. The compositional basics presented in Chapter 5, Tools of the Trade, also play an important role in implementing these depth cues. The first step (you'll start to see a pattern here) is observation. Just as lighting requires the study and observation of how the human eye perceives illumination, compositing requires those same steps to understand how to combine layers in a realistic manner. Once the rules are learned, they can be stretched or broken to serve the intent of the artist. A strong understanding of how camera lenses and the human eye interpret the world and its three-dimensionality will give the compositing artist a visual vocabulary with which to build images. Compositing is a construction process with images, elements, and compositing operations being the layers with which you build.

Observing the world around you is the first step, so studying and breaking down the depth cues in a traditional photograph is a good starting point. The photo in Figure 34 shows many of the depth cues seen every day but, in many instances, taken for granted. The specific depth cues examined here are the following:

Overlap
Scale
Level of detail
Depth of field
Contrast
Atmosphere

[Figure 34] A photograph displaying various depth cues.

Overlap

There are several distinct layers represented in this photograph (Figure 34), with multiple foreground elements, a distinct midground, a background, and a distant background. Starting with the foreground and working backward into the image, look at the oblong light mounted on the wooden post. The most basic indication that this object is in the foreground is the fact that it is obscuring other parts of the scene. The light fixture overlaps the shrubbery, water and dock retaining wall. Based on experience viewing the world, our eyes interpret this overlap as a clear indicator that the light fixture is closer to the camera than the other objects. With a photograph, the elements automatically overlap each other correctly according to their placement in the scene. In a composited image, however, the compositor chooses the layering order of elements. If the light fixture is element A and the shrubbery is element B in a composite script, placing A over B is just as easy as placing B over A. Layering the shrubbery over the light fixture would be possible, and due to the close proximity of the two layers in the scene, it might even look correct. It would, however, drastically change the composition and emphasis of the shot, with just a small portion of the light fixture peeking over the shrubs. A compositor must always maintain a clear understanding of the relationships between objects in three-dimensional space and attempt to layer the elements accordingly.

Scale

Another depth cue for this foreground element is its scale. The size of the light fixture, relative to the other objects in the scene, gives the viewer a clue to its proximity to the camera. If an element is larger in frame relative to another assumed comparably sized object in the scene, then it appears to be closer to the camera. This effect takes into account a viewer's background knowledge and sense of scale with known objects. For instance, even though it may not be of a type specifically seen before, a light fixture has a certain size in most people's minds. Scale has a great deal to do with human perception and experience, and the way we are accustomed to seeing things. Because the world is taken in through the eyes and processed with the brain each day, familiar items are categorized in terms of their scale. The trees in the midground of the photo are a good example. The trees relative to the scale of the light fixture are actually smaller in this photograph. The mind interprets the scale of the trees to mean they are much farther away from the camera than the light fixture.

Level of Detail

Along with scale, it is important to note the level of detail of each layer within a scene. Every portion of a scene displays details to help the viewer understand how far it is from camera. Certain objects display a greater level of detail, giving a clue as to their distance from the viewer. In Figure 34, the plant life along each bank of the waterway provides a good comparison of level of detail. The plants on the bank nearest to the camera can be clearly resolved into leaves, flowers and branches. The plants on the opposite bank, however, can only be seen as a green color and a basic outline. When rendering elements for a computer graphics scene, this type of depth cue not only helps a composition come together more realistically, but also presents an opportunity to save rendering time. The objects nearest camera need attention to detail in their textures, lighting and resolution. The same object placed farther away in the scene requires less attention in each of those areas. Whereas a foreground object may require rendering at a resolution of 2k, with high-resolution textures and large shadow maps in the lights, a midground object may only require a 1k render, with medium resolution textures and much smaller shadow maps. Each of those optimizations can save a tremendous amount of render time. Attention to detail in composition is important but not in areas in which it goes unnoticed. Spending time on the texture and lighting quality of foreground objects will usually be more important to establishing the quality level of a shot than using the same attention to detail in midground and background elements. There are a couple of exceptions to this: when a foreground element moves quickly through the scene and is highly motion-blurred, or when a midground or background element will be used in another shot in which it appears closer to camera.

[Figure 35] A photograph with a narrow depth of field.

Depth of Field

Another depth cue, which often dictates whether the level of detail is discernible, is depth of field. A camera lens, along with the human eye, can only focus on a specific distance from the lens, and anything in front of or behind that point is not in precise focus. The range that appears in focus is dependent on the aperture and focal length of the camera recording the image. A shot taken from exactly the same point as in Figure 34 can yield a very different result with a much lower f-stop setting (a wider aperture), which results in a narrow depth of field (see Figure 35). In this new image, focus is centered on the light fixture in the foreground. Notice that the range of focus is so narrow that even the foreground shrubs are now becoming blurred. This narrow range of focus can also serve to trick the viewer into thinking the midground trees on the opposite side of the waterway are farther away than they really are. Because the human eye often maintains a broad depth of field, the interpretation when viewing an image with a narrow depth of field can be that of a drastically increased distance between layers in the scene. Another difference between Figures 34 and 35 is in the level of detail on the foreground light fixture. At first glance, Figure 34 appears to represent the majority of the scene in focus, including the light fixture. When comparing it with Figure 35, however, it is clear that the level of detail in the foreground object is much higher in Figure 35. The corrugation on the interior portion of the light is now evident, as are the screws at the base and the grain on the wooden support post. If the light fixture is the point of interest for this shot, then Figure 34 has failed to fully display its level of detail, despite the greater overall depth of field. The compositor must always keep in mind the intended emphasis of a shot and avoid operations that will diminish or obscure important elements in a scene.

Contrast

Contrast is an often overlooked depth cue. The level of contrast in an element and between elements in a scene helps the viewer place the objects in 3D space. Contrast is closely related to each of the previous categories, in that each can increase or decrease the perceived level of contrast within an element or between elements. Contrast is the amount of difference between the brightness of different elements. The greater the difference is between elements, the higher the level of contrast. Unfortunately, the human perception of contrast is affected by many variables, which makes contrast a much more complicated topic. Outside factors, such as color, saturation, and illumination, can change the perception of contrast within a scene. As a general rule, elements closer to the camera appear higher in contrast than those farther away. This is tied closely with level of detail, which allows the viewer to resolve an element in clearer focus and thereby perceive it as having crisper edges and higher contrast and color saturation values. Objects closer to the camera do not necessarily have higher contrast or saturation values, but the camera lens, as well as the human eye, is able to resolve closer objects with more detail than those farther away. Be aware that juxtaposition can also affect the perception of color and brightness. A solid white object appears brighter when placed on a solid black background, as opposed to a light gray background. Likewise, a blue element appears more saturated and vibrant when surrounded by its complementary color of orange, as opposed to a color such as purple or green. The Art of Color, by Johannes Itten, provides an excellent reference for studying the principles and perception of color.

Atmosphere

The last depth cue in this discussion is atmosphere. Looking back at Figure 34, notice the two most distant layers in the background, the tree-covered hill and the peak in the distance. Each of these layers displays the effects of atmosphere on elements far from the camera. Depending on the amount of moisture in the air, the atmosphere will reduce the contrast and clarity with which distant elements are perceived. The layer of background trees on the hills appears to have much less contrast than the trees in the foreground, due to the effects of the atmosphere on the resolution. The most distant peak in the photo shows even greater atmospheric effects, with almost no detail or contrast visible. Each of these elements also begins to take on the value of the area surrounding them, in this case the sky. The water and particles in the atmosphere reflect the color of the sky, and as the distance from the camera increases, this effect becomes more pronounced. With elements in this scene extending farther back in the distance, dark areas become bright or washed out and brighter areas become duller. Comparing the distant trees with the palm trees in front of them, the difference in perceived contrast is clear. The palm trees can be more clearly delineated, and the atmospheric hazing effect is less because they are much closer to the camera. In a comp script, the effects of the atmosphere on distant elements can be simulated by blurring the elements, decreasing their contrast levels and by adding a sky-colored haze. Atmosphere, along with each of the other depth cues, can help in adding a great deal of realism to computer graphics elements and composites.
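
Two of the three adjustments named above, reduced contrast and a sky-colored haze, can be sketched on a single RGB color. This is only one plausible way to model them, not a method given in the book: the mid-gray pivot of 0.5, the single strength value shared by both adjustments, and the function names are all assumptions. Blurring is left out because it needs neighboring pixels rather than one color.

```python
def reduce_contrast(color, amount, pivot=0.5):
    """Pull each channel toward a gray pivot; amount=0 leaves it unchanged."""
    return tuple(pivot + (c - pivot) * (1.0 - amount) for c in color)


def add_haze(color, sky, amount):
    """Blend a color toward the sky color; amount=1 gives pure sky."""
    return tuple(c * (1.0 - amount) + s * amount for c, s in zip(color, sky))


def atmosphere(color, sky, amount):
    """Apply both adjustments with one strength value (an assumption)."""
    return add_haze(reduce_contrast(color, amount), sky, amount)


sky = (0.5, 0.75, 1.0)        # hypothetical sky color
dark_tree = (0.0, 0.25, 0.0)  # dark green foliage

print(reduce_contrast((0.0, 1.0, 0.5), 0.5))  # (0.25, 0.75, 0.5): darks lift, brights dull
print(add_haze(dark_tree, sky, 0.5))          # (0.25, 0.5, 0.5): foliage drifts toward sky
```

Raising the amount for layers farther from the camera reproduces the progression seen in the photograph, where the distant peak has almost no contrast left and sits close to the sky color.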

Continue to Compositing Techniques and Methods Part 2.

To learn more about lighting and compositing and other topics of interest to animators, check out Inspired 3D Lighting and Compositing by David Parrish; series edited by Kyle Clark and Michael Ford: Premier Press, 2002 (266 pages with illustrations). ISBN 1-931841-49-7 ($59.99). Read more about all four titles in the Inspired series and check back to VFXWorld frequently to read new excerpts.

Author David Parrish (left), series editor Kyle Clark (center) and series editor Mike Ford (right).

David Parrish went straight to work for Industrial Light & Magic after earning his master's degree from Texas A&M University. During the five years that followed, he worked on several major films, including Dragonheart, Return of the Jedi: Special Edition, Jurassic Park: The Lost World, Star Wars: Episode I The Phantom Menace, Deep Blue Sea, Galaxy Quest and The Perfect Storm. After five years with ILM and a short stay with a startup company, he was hired by Sony Pictures Imageworks to work on Harry Potter and the Sorcerer's Stone.

Series editor Kyle Clark is a lead animator at Microsoft's Digital Anvil Studios and co-founder of Animation Foundation. He majored in film, video and computer animation at USC and has since worked on a number of feature, commercial and game projects. He has also taught at various schools, including San Francisco Academy of Art College, San Francisco State University, UCLA School of Design and Texas A&M University.

Michael Ford, series editor, is a senior technical animator at Sony Pictures Imageworks and co-founder of Animation Foundation. A graduate of UCLA's School of Design, he has since worked on numerous feature and commercial projects at ILM, Centropolis FX and Digital Magic. He has lectured at the UCLA School of Design, USC, DeAnza College and San Francisco Academy of Art College.