Audio for Animation

Mary Ann Skweres looks into an essential part of the animation production process: sound.

Animation sound has evolved with the digital revolution, but despite sweeping advances in technical production, the vocal performance remains at the heart of animation audio.

Animation is one of the most popular and profitable genres in entertainment. Animated films have been among the top-grossing films of the last five years. Television animation has been a mainstay of children's programmers pretty much since the advent of the tube. In animation, all the sound is completely invented in post: a synthetic creation, manufactured out of raw materials. The best sound design is a hidden art, a seamless integration with the moving image, existing in support of the story.

Like many film and television techniques, the creation of animation sound has evolved with the digital revolution of the last 20 years. Paul Vitello of Vitello Prods. in Burbank, California, first started working in animation in 1984 on the animated television series Voltron. It was the first television cartoon produced in stereo, even before the FCC approved the technical format that we now take for granted. In their excitement over the use of that innovative technique, Vitello admits, "We panned everything. You could get whiplash listening to the shows. We wanted to get movement in the shows, but it only really worked if it was motivated."

The control booth for Studio A at L.A. Studios Group's facility. The company specializes in anything audio, especially voice recording on big-budget animation features and television series.

But despite sweeping advances in the technical production, the vocal performance remains at the heart of animation audio. "The human aspect of these films is the voice... People believe that voice," says Carlos Sotolongo, dialog mixer at the L.A. Studios Group, which specializes in anything audio, especially voice recording on big-budget animation features and television series. Sotolongo has credits on just about every DreamWorks animated project, including the Shrek movies, Shark Tale, Prince of Egypt and Madagascar, plus work for other studios on films such as Nickelodeon's Jimmy Neutron: Boy Genius and Universal's Curious George.

In feature animation, the audience expects the high-quality vocal performances that A-list stars bring to the film. One of the most difficult things for Sotolongo is being prepared, because every session is different and, when you are working with stars, "You have to be ready to go and right." Animated feature film projects can take three to four years and require several passes on the vocal recordings. The process is an elaborate one.

Trackreading, a method of animation lip synching mostly used in traditional 2D animation, is a dying art with computer-generated animation programs, observes Tim Borquez of Hacienda Post.

Once a rough draft of a script is written, the directors and producers will bring in actors (not necessarily the final actors) to record the whole thing so that they can hear it off the page. There can be as many as 16 people in the booth at the same time, recording on 16 different mics. This group recording allows the actors to play off one another's performances to achieve the comic timing required by the story. It usually takes a couple of days to record the whole script that way. The creative team then leaves, makes their tweaks and casts the film.

At this point, the final actors will come in. Depending on the movie, it can be one at a time or several people at a time. L.A. Studios is flexible enough to accommodate a large group of people as well as just one actor at a time. The facility can record several people at once but still maintain complete isolation, because it has multiple isolation booths. This allows the actors to jump on each other's lines while still yielding a clean recording of each performer's lines.

Once the voice is recorded, the tracks are slugged, meaning the action and dialog are paced out against the storyboards. Then the process moves on to trackreading: phonetically transcribing every word of dialog frame by frame, then measuring it onto animation exposure sheets for the artists to animate. Films are animated to the voice, a small portion at a time. Between all these steps, the actors come back and re-record because there are tweaks to the script or lines have changed. The feature picture editors work with the audio and storyboards and lay in the various passes.
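The trackreading step can be pictured as a small calculation. The sketch below is only an illustration, not the tool any of these studios use: it assumes a film rate of 24 frames per second, frames numbered from 1 as on a paper exposure sheet, and a hypothetical list of (sound, start time in seconds) pairs with invented timings. It turns each sound into the frame where its exposure sheet entry would begin.

```python
FPS = 24  # assumed film frame rate; the article does not state one


def track_read(timed_sounds, fps=FPS):
    """Map (sound, start_seconds) pairs to (frame, sound) entries.

    Frames are numbered from 1 (an assumption made for this sketch).
    """
    sheet = []
    for sound, start in timed_sounds:
        frame = int(start * fps) + 1  # frame in which the sound begins
        sheet.append((frame, sound))
    return sheet


# Hypothetical reading of the word "hello"; timings invented for illustration
hello = [("HH", 0.0), ("EH", 0.125), ("L", 0.25), ("OW", 0.375)]

for frame, sound in track_read(hello):
    print(f"frame {frame:3d}: {sound}")
```

A real trackreader works by ear against the recorded track, so the timings here stand in for that listening pass.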

There is some leeway to animation lip sync. In animation there are normally five or six pre-determined mouth positions. Sometimes animators will get more detailed with the mouth, but usually they assign these mouth positions (a step known as lip assignment) based on the phonetic transcription. The animation might have hard sync at the starting and ending positions, but the animation in the middle of a line can be looser. Trackreading is a method mostly used in traditional 2D animation.

With computer-generated animation programs, it is becoming a dying art, according to Tim Borquez of Hacienda Post. Borquez worked with veteran sound editor Sam Horta of Horta Editorial, beginning in the days of 35mm film and mag tracks and transitioning to tape and digital formats. In 1999, after his mentor passed away, he formed Hacienda Post, a boutique house that provides full-service sound to the entertainment industry.

For dialogue mixer Carlos Sotolongo of L.A. Studios, the positioning of the microphone is very important, as is the choice of microphone.

In Flash animation there generally is no need for trackreading. Lip assignments are made on the fly from standardized mouth positions already programmed into the computer. Current computer-based animation platforms such as Maya have lip assignment tools. They can read a track and create a first pass of mouth positions for the animators, based on how the program analyzes the track. Borquez explains, "It will actually listen and then say, 'That sounds like a T and that sounds like an A. That A came after a T so we will assign it mouth position four.' The computer will actually do it. All the animator needs to do is go through and check it to see how the computer did on its pass." The animator can tweak the pass.
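As a rough illustration of that kind of first pass, the sketch below assigns a mouth position to each sound from a lookup table, with an override when a sound follows a particular neighbor. It is not how Maya or any other package actually works; the table and its position numbers are invented, and only the "A after a T gets position four" rule is taken from Borquez's example.

```python
# Hypothetical table: sound -> one of six mouth positions (numbers invented)
DEFAULT_POSITION = {"M": 1, "B": 1, "P": 1, "T": 2, "S": 2, "E": 3, "A": 5, "O": 6}

# Context rules: (previous sound, current sound) -> mouth position.
# Only the T-then-A rule comes from the quote; a real tool would have many.
CONTEXT_POSITION = {("T", "A"): 4}


def first_pass(sounds):
    """Return (sound, mouth_position) pairs for an animator to review.

    A position of None marks a sound the table does not cover,
    which the animator would have to assign by hand.
    """
    result = []
    previous = None
    for sound in sounds:
        default = DEFAULT_POSITION.get(sound)
        position = CONTEXT_POSITION.get((previous, sound), default)
        result.append((sound, position))
        previous = sound
    return result


print(first_pass(["T", "A", "M", "A"]))
```

The same "A" lands on a different mouth position depending on what came before it, which is why the animator still checks and tweaks the pass.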

Although Borquez has worked extensively in live-action, the company's bread and butter has been animation. He got started as a music editor in the early 1980s on shows such as Fat Albert and the Cosby Kids, He-Man and the Masters of the Universe and She-Ra: Princess of Power, and then moved on to sound editing and mixing. Noted for his work in animation, Borquez has worked for animation giants such as Disney and DreamWorks. He was supervising sound editor on projects that include The SpongeBob SquarePants Movie, Tom and Jerry Blast Off to Mars and Bionicle 2: Legends of Metru-Nui, for which he won a 2005 Motion Picture Sound Editors (MPSE) Golden Reel award for Best Sound Editing in Direct to Video.

Unlike live-action, where a large part of the audio is recorded during the shooting of a scene, the technical recording of dialog in a studio is a key component in making the animated film believable. Sotolongo explains, "You have to make it sound natural. If it is a shot of characters in the woods, you have to mic it and prepare the room so that it actually conveys the feeling of being in the woods. When you listen to it, it is not something obviously that was recorded in a small room."

L.A. Studios' Studio A is a very large room, about 1,250 square feet. It also has isolation booths. The positioning of the microphone is very important, as is the choice of microphone. Sotolongo usually uses the Neumann U87, which has been the industry standard for years. It is a full-range microphone, so a deep voice will come across, but it is also very flat on the high end. When actors are recorded one at a time, the studio will use three or four mics for every single actor in order to create the spatial ambiance, so that the editor, and ultimately the mixer, can choose what perspective the audio is coming from. The studio has been using computer-based mixing tools since the early 1990s, including ProTools and the fully automated Sony DMX-R100 mixing consoles.

Paul Vitello (left), with R.D. Floyd of Paul Vitello Prods., finds that with most TV animation, the best results come when actors match their performance to the action of the storyboards.

CD-quality sound (16-bit, 48K), a low-resolution standard, can cause problems in recording vocal audio performances. R.D. Floyd, sound supervisor and audio engineer of Paul Vitello Prods., prefers Sennheiser 421 microphones for vocal recording. These dynamic mics are able to handle vocal performances that range from low to screaming. Floyd says, "The U87 is a wonderful microphone and is ideally one of the best vocal mics, but because of its sensitivity and dynamic range, it picks up everything. It was okay in the analog domain. The tape and electronics was more forgiving and smoother and warmer."

"What happened in the digital domain with 16-bit and 48K, the sample rates of converting analog sounds to the digital domain, is a bit gritty and edgy at 16-bit. It just so happens that some of the sounds that take place in the mouth, in the teeth, in the tongue (the sounds of the human instrument) are very snappy, spiky, little sibilant ticks and pops, which sound like digital timing errors on tape. This is a very ugly sound because you have little microsecond noises that are always there. You can hear them if you put your ear close to someone's mouth, but this isn't complimentary in the animation." Higher-quality, high-definition sound formats, such as 24-bit at 96K sample rates, are helping to smooth out the problem, which is common in the industry, bringing reproduced audio back to the mastering quality of early 1980s analog studio recordings.
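For a rough sense of what the move from 16-bit, 48K to 24-bit, 96K buys on paper, the sketch below uses two textbook figures that are not from the article: the theoretical signal-to-noise ratio of an ideal quantizer for a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB) and the Nyquist limit of half the sample rate. Real converters fall short of these ideals, so the numbers are best read as upper bounds, and they say nothing about how a given microphone or voice will actually sound.

```python
import math


def dynamic_range_db(bits):
    """Theoretical SNR of an ideal quantizer for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 1.76


def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2


# The two formats mentioned in the text
for bits, rate in [(16, 48_000), (24, 96_000)]:
    print(
        f"{bits}-bit, {rate // 1000}K: "
        f"about {dynamic_range_db(bits):.0f} dB theoretical range, "
        f"content up to {nyquist_hz(rate) / 1000:.0f} kHz"
    )
```

Each extra bit adds roughly 6 dB of theoretical range, so eight more bits push the quantization noise floor about 48 dB further below the signal.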

Once the dialog and animation on a film have been locked, the film moves into a post-production sound process equivalent to that of live-action, except that all the sound is created. Randy Thom of Skywalker Sound, the creative force behind the sound design for both The Polar Express and The Incredibles, says, "The beginning of sound design is always in the script and in the director's vision of the movie."

One of the tricks to making animation seem real is using real sounds. Although The Incredibles included elements of sci-fi, Thom grounded its audio design in a realistic, naturalistic style. Sound effects were real recorded sounds, never electronically produced. According to Thom, "Real sounds have a complexity that makes people feel like they're really there. The more you have real sounds, the more real it will sound."

Randy Thom of Skywalker Sound used everything from a real steam train to running a violin bow over pieces of sheet metal to create sounds for The Polar Express. © 2004 Warner Bros. Ent. All rights reserved.

A large part of a sound designer's creativity is the ability to layer audio to create new, yet believable effects. For the train in The Polar Express, Thom used everything from a real steam train to running a violin bow over big pieces of sheet metal to create metallic ringing. Because director Robert Zemeckis considered the train a central character in the film, Thom even slipped in breathing, hidden under the hiss of real steam. He made isolated, close-up recordings of all the specific train elements and loaded them into a piano-style keyboard, performing the train like a musical piece complete with tempo changes and full of chugs, steam and rail clacks.

Hacienda Post does three to four long-form projects a year, creating the sound design on features from scratch. The company has its own foley and mixing stages, including theatrical, mid-size and smaller television stages, and works with a ProTools HD-4 system and Focusrite Control 24 mixing. In tandem with the director and producer of the project, the team creates a palette of sounds for the film. The design is dependent on the genre of the story, such as action/adventure, sci-fi or comedy. On a long-form film, like those in the futuristic Bionicle series that was produced for Miramax, every single piece of foley, every vehicle, every weapon and every power possessed by each of the characters was original. The team spent four to six months on each of these projects.

Television animation uses many of the same techniques as those in feature films, but is limited by increasingly smaller budgets and tighter schedules. According to Vitello, the basic process in television animation sound is to record dialog, assemble an audio track, adjust sync as the animation is supplied, cut together the sequences, call for retakes on the animation, cut the animation back into the show, cut to length for broadcast, lock picture, mix and deliver to spec.

Hacienda Post creates its sound design on features from scratch on its own foley and mixing stages (above). In tandem with the director and producer of the project, the team creates a palette of sounds for the film.

Vitello has been working in animation audio for a good 20 years. His credits include Barbie's first movie, Barbie in the Nutcracker, and strip shows such as Go-Bots, The Superpowers Team: Galactic Guardians, Defenders of the Earth, Captain Simian and The Space Monkeys, Iron Man, Voltron and the current Z-Force, an animated series centered around 12 lead characters, ages 17-28, who represent the 12 signs of the Chinese Zodiac.

Although veteran cartoon voice actors are capable of doing multiple voices, for Z-Force, Vitello wanted a distinct voice for each of the characters, so he cast a combination of seasoned cartoon animation actors and live-action actors who had experience in animation. It was important for the actors to be artistic, with highly skilled technical ability in their craft, so that they could create vocal characterizations appropriate to each character.

After the initial storyboard is created for a show, the dialog is recorded. With the quick turnaround of most TV animation, Vitello feels the best results are achieved when the actors match their performance to the action of the storyboards, rather than the animators matching the actors, a technique often used in feature animation, where stars bring much of the physical characterization to the animation through their vocal performances. Dialog is recorded into four channels of ProTools. Each voice is recorded in isolation, but actors work in groups to create an ensemble for better performance.

Network series have a short turnaround, usually one show a week over the course of a 13-week schedule. To overcome the constraints of the short time schedules, a library of sounds, including some created for a particular show, can be used in the course of editing the series. Cable shows allow a bit more time: two weeks for a half-hour show. Having picture editor Janet "Lime" Leimenstoll in house aids the smooth workflow at Vitello Prods.

Network series have a short turnaround. At Vitello Prods., in-house picture editor Janet "Lime" Leimenstoll helps to overcome the constraints of the time schedules.

Whereas the final mix on a feature animation may take several weeks, a television show mixes in a day. A direct-to-video feature will have a five- to eight-day mix, depending on the budget.

Service to the client is a high priority for all the audio artists and companies that work in animation sound. Working over a long period on an animated feature film, L.A. Studios gets to know the talent and what makes them comfortable. The staff take pride in providing a recording environment conducive to great vocal characterizations. Catering to stars includes knowing what snacks they like, their brand of soft drink or mineral water, and low-key valet parking. For the added convenience of clients, the company has facilities in Hollywood, Universal City and Santa Monica. Geoff Nathanson, manager of L.A. Studios' Hollywood facility, comments, "We're enablers. We work with the best to produce big time features."

Since the inception of Hacienda Post, Borquez's goal has been to create and maintain a highly creative and technically superior boutique operation, with a dedicated team of supervisors, editors and mixers on site providing creative interaction at every step of the process, so that every client is assured that their projects receive the high level of attention they deserve.

Vitello perhaps says it best when he concludes, "The business has changed, but the craft is the same. We do quality work."

Mary Ann Skweres is a filmmaker and freelance writer. She has worked extensively in feature film and documentary post-production with credits as a picture editor and visual effects assistant. She is a member of the Motion Picture Editors Guild.
