
'Sid the Science Kid': Henson Uses Mocap Smartly

Joe Strike dons his lab coat to discover the secrets behind the new PBS Kids series.

Sid represents a quantum leap forward for Henson puppeteering. All images ™ & © 2008 The Jim Henson Company.

One day in 1955, Jim Henson glued a pair of cut-in-half ping-pong balls onto the sleeve of a coat his mom had thrown away, slid it over his arm and gave birth to a frog named Kermit.

Animators and puppeteers have at least one thing in common -- they both use their hands to bring their characters to life. A traditional animator once classified (dismissed?) CGI as a form of puppetry. Perhaps he was onto something; perhaps it was inevitable that those two media would someday merge into a new, hybrid form of animation.

At The Jim Henson Company, that hybrid is known as HDPS -- the Henson Digital Puppeteering System -- and on Labor Day its first full-tilt creation, Sid the Science Kid, joined PBS Kids' lineup of educational shows.

When it comes to complex characters, it's SOP for puppeteers to work in tandem. It takes two people to make Sesame Street's Ernie talk and wave his arms around at the same time: one controlling his head and an arm while the other lends (ahem) a hand working the other arm.

Henson took it several steps further in movies like Labyrinth and the company's Dinosaurs TV series. In those efforts, actors encased in head-to-toe character costumes mimed out a performance while off-camera puppeteers provided not only their voices, but animatronic facial expressions and mouth movements via radio remote control.

In Sid the concept takes a quantum leap forward, into the realm of animation. Start with the mind-meld of two puppeteers acting as one... give the voice/face puppeteer computer-based controls... replace the character costume with a motion capture rig... then have the mocap performers interact with each other from across a mocap stage, pretending they are side by side in the scene being shot. (They can't actually stand next to each other, because half of them are scaled down to child size.)

If it sounds complicated, it is; impossible, however, it's not. Henson's system pumps out animation in real time -- and in multi-camera perspective, complete with camera moves. It's the equivalent of shooting a live, on-stage sitcom that happens to star cartoon characters.
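Conceptually, each character is two synchronized input streams fused onto a single CG puppet every frame: the body performer's mocap skeleton and the face/voice puppeteer's hand controls. Here's a minimal sketch of that merge in Python (all class and field names are hypothetical, invented for illustration -- this is the idea, not Henson's code):

```python
from dataclasses import dataclass, field

@dataclass
class BodyFrame:              # one frame from the mocap performer's suit
    joint_rotations: dict     # joint name -> (x, y, z) Euler angles

@dataclass
class FaceFrame:              # one frame from the face/voice puppeteer's controls
    mouth_open: float         # 0.0 (closed) to 1.0 (wide open)
    brow_raise: float
    eye_blink: float

@dataclass
class PuppetFrame:            # what the real-time renderer consumes
    joint_rotations: dict
    blendshapes: dict = field(default_factory=dict)

def merge_streams(body: BodyFrame, face: FaceFrame) -> PuppetFrame:
    """Fuse both performers' live inputs into one character pose."""
    return PuppetFrame(
        joint_rotations=body.joint_rotations,
        blendshapes={
            "mouthOpen": face.mouth_open,
            "browRaise": face.brow_raise,
            "eyeBlink": face.eye_blink,
        },
    )
```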

On the mocap stage, PVC piping stands in for the CGI sets.

Katy Garretson has directed episodes of Frasier, Freddie and George Lopez. She's a veteran of multi-camera production but had never worked in animation or on kids' shows until Sid. With six of the show's 40 episodes under her belt, Katy's now an old pro at this new technology.

"They told me they wanted to try a live-action person, they wanted a sitcom style. The show is about kids, families, friends and school -- they wanted someone familiar with working with actors. The script they gave me looks like a regular sitcom script, but the difference is when you walk onto the stage. There's no sets, no dressing rooms, no cameras. They were there, but in different forms."

Mocap took place on a 40-by-60-foot Hollywood stage, its rubber floor mapped out with PVC piping indicating where the CGI furniture for each "set" is located. Actually, the floor is mapped out twice, the second time at a larger scale -- the "kids" side of the studio. "We had all these different worlds, different realities within this rubber mat. When dad puts his hand on his son's shoulder, they're actually 30 feet apart. There's also a performer who plays a baby, so there was a third reality in the corner we had to shrink down further."
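A rough sketch of how those scaled zones could work, assuming each zone is nothing more than an origin offset plus a uniform scale (the zone layout and the 0.6 and 0.35 factors are invented for illustration, not Henson's numbers):

```python
import numpy as np

# Each performer acts inside their own floor zone; their root position is
# remapped into one shared virtual set. Origins and scales are illustrative.
ZONES = {
    "adult": {"origin": np.array([0.0, 0.0, 0.0]),  "scale": 1.0},
    "child": {"origin": np.array([30.0, 0.0, 0.0]), "scale": 0.6},
    "baby":  {"origin": np.array([0.0, 0.0, 25.0]), "scale": 0.35},
}

def to_virtual_set(zone: str, stage_pos: np.ndarray) -> np.ndarray:
    """Map a performer's ground-plane stage position into the shared set."""
    z = ZONES[zone]
    # Re-base to the zone's origin, then shrink toward it, so an adult-sized
    # performance reads as a child-sized character in the same virtual spot.
    return (stage_pos - z["origin"]) * z["scale"]

# Dad's hand lands on Sid's shoulder even though the two performers stand
# some 30 feet apart on the physical stage:
dad = to_virtual_set("adult", np.array([4.0, 0.0, 2.0]))
sid = to_virtual_set("child", np.array([36.7, 0.0, 3.3]))
print(np.linalg.norm(dad - sid))  # ~0.03: side by side in the virtual set
```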

Katy goes on to describe a virtual studio where the prop master sits at a computer and "punches" props into the scene and "camera" operators work at joystick stations. "It's kind of like a video game. Their virtual cameras are like mini techno-cranes. They can do 360-degree moves without getting in each other's way. They can go above or below the set and get any shot you want.

"I was able to flex my camera creative muscles a little more on this show."

Where a CGI feature typically takes 18 months to produce 90 minutes of animation -- roughly five finished minutes a month -- the use of mocap allows Henson to produce 20 hours in one year, about 100 minutes a month: a twentyfold jump in throughput.

Kerry Shea is the Henson Creature Shop's head of digital production. For the last two and a half years she and the Henson people have been gearing up HDPS to do Sid, a humor-driven concept (Sid opens and closes each episode with a mini stand-up routine à la Seinfeld) that originated in-house. Components of the system had been given a run-through on a series of webisodes called Skrumps, produced for Yahoo! Kids, and on Frances, a PBS KIDS Sprout series.

"HDPS isn't new technology, just much improved," says Kerry. "We were able to put together many pieces to build it up. What we have on Sid are five [puppet] rigs and five body performers working simultaneously in a CG environment built in Maya. Four different camera views are being recorded in real time, plus a switcher feed, just like in a live-action sitcom.

"In two and a half days you can do a virtual shoot of 22 minutes of animation. If you don't like what you see, you can do another take on the set, or change the camera angle. All that information is in the editorial bay the next day. The efficiency is incredible."

A lot of puzzle pieces had to be put together to achieve that efficiency. Steffen M. Wild comes from the world of effects-driven feature films, and his specialty is constructing production pipelines for complex projects; as Sid the Science Kid's digital effects supervisor, he had to oversee the entire digital workflow. With one week to go before the show's premiere, 12 episodes had been delivered to PBS, while the remaining 28 neared completion. It might seem like a tight squeeze to an outsider, but "it was scheduled like that," Steffen says. "This was Henson's first high-definition TV project. When we started, we took a serious look at disk storage, networking, our render farm -- all those aspects. The HD image is at least four times the size of standard definition. It was an added challenge for the company to take that next step.

"In terms of render capacity and things that needed to be in place, our developers were working constantly to create software to optimize the workflow. Now we're seeing the same exponential curve as in feature film production: a lot of ramp-up time while we looked for optimal ways to render and code was being written. All those things are now taking hold and we're seeing the fruits of that.

"In a feature film, you're creating 90 minutes of CGI animation in 18 months; here we're doing 20 hours in one year. We have to be fast and furious. People in the CG industry are used to it, and it's also a creative challenge. If you have all the time in the world, your mind keeps wandering, you keep creating new tools and it's never good enough. But with tight deadlines, you have to ask yourself, how can I achieve the best results with what I have in a given time? You come up with all kinds of interesting things."

One solution right from the start was to stick with as much out-of-the-box software as possible. "We used Maya 2008 to build our models, and Photoshop for texturing," says Steffen. "Two very standard packages." Another shortcut was limiting the settings to eleven different environments and the cast to nine characters.

It takes two performers -- one for the body and one for the face and voice -- to bring each character to life.

Garretson quickly discovered her previous directing experience didn't prepare her for working on a virtual set. "On a regular sitcom, I'd get a script on day one of five. We'd have a couple of rehearsal days where I could see my cameras, try different things and give direction to the performers.

"Everything was different on Sid. I'd get a script a month in advance. I had to plan the show in my mind -- there was no advance rehearsal, no staging or blocking. We had a series of meetings before shoot day where everything was laid out: is the milk carton in the shot, how does Sid hold the spoon?

"Then I had to give direction to two people for every character as though they were one. You don't forget to do that more than once or twice, because if you tell the body performer to go one way and you forget to tell the face/voice person what you're doing -- it's kind of funny to watch," she laughs. "But I was floored by how in sync they all were -- how Drew [Massey], doing Sid's voice and face, and Misty [Rosas], doing Sid's body, and not even in visual contact with Drew, how their reactions would sync up as if they were inside each other's brain."

Mocap performers are usually dressed in a sensor-studded body stocking, but Sid required a different outfit. Behind-the-scenes footage shows performers wearing what look like truck tires around their waists and garbage-pail-sized footwear. "We call them 'outriggers,'" Steffen explains. "In our research we found that it's better if the puppeteer wears a belly outrigger with a certain heaviness to it" to match the child characters' lower center of gravity. "He or she would perform differently in that, rather than pretending to have a heavy belly."

The system rendering the animation in real time (in what Kerry describes as "mid-rez" quality) is a high-powered game engine dubbed "The Viewer." Large on-stage monitors display the animation so that performers and director can decide on the spot whether they're capturing the best performance and camera moves. The video from the four "cameras" (plus a line cut from the shoot) is delivered to editorial, where it is loaded into Final Cut Pro.

With four different camera views recorded in real time, plus a switcher feed, Sid is shot just like a live-action sitcom.

After the episode is fine-tuned and the cut is locked, the motion capture data is shipped to a company called Motion Analysis for cleanup.

"Occasionally some of the mocap markers get occluded -- they're not visible to all the cameras," says Steffen. "We contracted Motion Analysis to put the data through their proprietary software, which makes all the markers visible all the time. It eliminates high-frequency noise and the jittering you usually get with out-of-the-box mocap systems. They sell their system and we have one, but we also entered into a partnership with them to get a clean data set back."

Once back at Henson, the cleaned-up data is imported into a Maya scene file to generate the higher-resolution character animation for the show proper. The shots are actually rendered three times over in what Kerry says is standard for CG animation: a color pass, a matte pass and a monochrome shadow pass that together give the animation full dimensionality (a rough sketch of how such passes combine appears below). Apple's Shake compositor merges the three passes into the final image, but a different compositor -- Scratch from Assimilate -- comes into play earlier in post to speed up the workflow. According to Steffen, it's used for color correction in the equivalent of the digital intermediate stage, between the CGI and final editorial. "It's not the application it was designed for, but it does the job very, very efficiently.

"Usually in CGI films, production works upwards from shots into sequences and then into the finished project. At Henson we've adopted the live-action paradigm. We shoot on stage, shooting take by take. Only at the very end in editorial do we cut it down to shots. When we have to address directorial notes in editorial and go back upstream on a take level, it becomes painful at that stage because it's already cut down.

"But with Scratch I can replicate my editorial timeline with all the mattes I need and address requested changes -- a more blurred environment, more sun, a brighter exterior, more shadows, etc. -- right there and pump it back into Final Cut without going all the way back to takes. We're not compromising image quality, but the tool allows us to work better, in a faster manner and achieve our goal."

As in any project charting new technological ground, unexpected glitches can pop up at any point in the production. One in particular put a serious crimp in rendering time until it was solved. "Most of the time it took us two minutes to render a frame in HD, but for some reason certain frames took two hours," Steffen reveals. "We had to dig deeper to find out why.

"It wasn't easy, but we finally discovered the topology of the environment was causing the problem. When the camera saw certain aspects of the far background, suddenly render time went through the roof. Our TDs spent three weeks on the problem, and we finally found out it was the anticipated behavior of the Maya software.

"We learned how to work with it. For those shots we created a special background pass" -- created from texture-treated 2D planes -- "that we rendered as a separate element. We got the render time for those shots down to about four minutes -- twice the number we were usually getting, but still much better than two hours."

Sid's tech-savvy mom probably understands the challenges of bringing a high-quality CG series to TV.

Solutions to those kinds of puzzles become part of Henson's arsenal of production tools. "The way we operate is to always be very self-critical," Steffen adds. "If a step works, that's great, but we immediately circle back to look at how we can improve things for next season or the next project -- or how a tool written specifically for Sid can be made more standardized for applications down the line." (Tools developed in-house for Maya are written either in MEL or Python, the two programming languages Maya currently supports.)
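For a flavor of what such a tool might look like, here is a deliberately tiny, hypothetical example in Maya's Python flavor. The maya.cmds calls are real, but the tool itself is invented, and it only runs inside a Maya session:

```python
import maya.cmds as cmds  # available only inside a running Maya session

def report_cameras():
    """Print every camera in the scene with its focal length -- the kind of
    quick sanity check a TD might run before sending shots to the farm."""
    for shape in cmds.ls(type="camera"):
        focal = cmds.getAttr(shape + ".focalLength")
        print("%s: %s mm" % (shape, focal))

report_cameras()
```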

"Our ultimate goal over the next few years is to reduce the number of hours we spend in post," Steffen sums up. "We're focusing all our development efforts on getting better data straight out of the real-time environment. That's why we're not investing a lot of resources in the post process; we're trying to keep that streamlined, to get as much mileage as possible with packages like Maya.

"We don't want a proprietary back-end render pipeline -- we're going for a proprietary real-time render pipeline. Our dream is to build characters and environments, then go onstage, shoot it and be done, the same as in live-action with only one editorial step afterwards. That's the vision. I'm sure over the next couple of years we'll be able to accomplish that."

Joe Strike is a regular contributor to AWN. His animation articles also appear in the NY Daily News and the New York Press.
