VFX supervisor Pablo Helman discusses the sophisticated digital de-aging process, sans markers, used to depict Robert De Niro, Joe Pesci and Al Pacino over a four-decade span.
De-aging actors has become a staple of the visual effects industry. But capturing their performances without facial markers presented a slew of technological challenges that VFX supervisor Pablo Helman (War of the Worlds) had to figure out on Martin Scorsese and Netflix’s epic biographical crime drama, The Irishman. Helman seems to have figured it out and then some; he has subsequently received Oscar, BAFTA and VES Award nominations for his efforts on the critically acclaimed film.
Based on the 2004 book, “I Heard You Paint Houses: Frank ‘The Irishman’ Sheeran & Closing the Case on Jimmy Hoffa” by Charles Brandt, the film chronicles, over several decades, Frank Sheeran’s career as an alleged hitman for the Bufalino crime family. The film stars Robert De Niro, 76, who had to appear as Sheeran at ages 24, 30, 36, 41, 50 and 65, with makeup taking over until he dies at age 83; Joe Pesci, 76, as mobster Russell Bufalino, whose age in the film ranges from 50 to 83; and Al Pacino, 79, as Teamsters union leader Jimmy Hoffa, whose age ranges from 44 to 62.
Of the movie’s 2,300 shots, 1,700 had visual effects, with 1,000 involving de-aged faces. ILM was the main vendor on the project, while 2D makeup fixes and hand work were provided by SSVFX and Vitality VFX.
According to Helman, Scorsese is intrigued by the ability of technology to assist with storytelling, but not driven to understand all the intricate details. “Knowing Marty, understanding what he is after, and knowing what he reacts to and is not interested in helped me to expedite our communication,” reveals the VFX supervisor, who previously collaborated with the director on Silence. “Marty is curious about technology and is not afraid to push things to the limit.”
The film’s success depended entirely upon the realism, in both look and mannerisms, of the de-aged actors. That required significant advances in CG technology to capture on-set performances without tracking dots or other actor-worn gear, which Scorsese specifically would not allow. “At the time [in pre-production], everybody was talking about a cloud system,” Helman explains. “Meaning, from a mathematical triangulation point of view, you could create pixels out of light. But it wasn’t tried out and wasn’t something that was available for high resolution stuff. It’s like the same way that deep fake and artificial intelligence are at right now in that they’re right at their infancy of high resolution.”
The filmmakers decided a good starting point was capturing the footage with three cameras. So, they set up a test; Scorsese got De Niro to spend a day reshooting a scene as Jimmy Conway from Goodfellas, which Helman and ILM then spent 10 weeks working on to de-age the actor back to his 1990 version from the film. “We began by saying, ‘Even if we don’t know how this is going to turn out, capturing the most amount of information we can from the set will give us the best results,’” Helman says. “For the first test we did in 2015, the central camera was film because we were matching to Goodfellas and two RGBs were on the sides as witness cameras.” The final three-camera rig had a RED DRAGON with a Helium sensor in the centre and two infrared-modified ARRI ALEXA Minis acting as witness cameras placed on the left- and right-hand sides. “The circumstances of getting soft light in a regular motion picture setting is non-existent because everything is either side or back lit,” he continues. “So, you’re going to get really hard shadows. If the software is looking for soft light, then the only way to do it is to flood the actor with infrared light because that illuminates the shadow without contributing to the lighting of DP Rodrigo Prieto [Brokeback Mountain].”
The software program Flux was used to capture the actors’ performances on set, under theatrical lighting, without markers; one consequence of depending on lighting and texture was that De Niro, Pacino and Pesci didn’t wear makeup. “We modified the software to compute only in the infrared spectrum [which the human eye cannot see],” Helman states. “If the camera can see the infrared spectrum then the shadows are gone.” De Niro, Pesci and Pacino took part in Medusa performance capture sessions to get the necessary geometry. “The Medusa sessions were done to capture all the things needed in order to create the models for the digital doubles,” Helman describes. “The three actors sat in front of eight cameras, which were placed at different angles, and performed a library of 100 poses. Because the Medusa system captures running footage, we were able to ask the actors to do a one-minute piece where they went from a small to wide range of emotions; that was used along with the poses to not only build the contemporary digital double model but also to derive the younger model and a library of shapes that get called in when the solves happen.”
Scorsese’s character designs impacted the efficiency and accuracy of the performance. “Marty didn’t want to go back to Jimmy Conway in Goodfellas because he had done that already 30 years ago,” Helman notes. “Marty wanted to design a character that was consistent with Frank Sheeran as a younger person who has a wider face and is heavier. The most important thing for Marty and me was to have the performances exactly as they were on set, and by doing that we preserved all of the choices that the actors made such as how they manipulated their faces to communicate something; that’s what produces what we call behavioural likeness. The other thing that we did change was how the actor looked. We took wrinkles away and shrunk under the chin and the neck because a lot of real estate happens there as we get older. We did not change a blink in terms of the timing of things because Marty did not want to alter the performances. It’s always a challenge as you want the audience to recognize the actors but you really want them to get into the story and make a connection with the characters.”
Along with the faces, posture and body weight were also altered for the different ages. “We did it incrementally, meaning at the beginning of [the] movie when they are younger, we did the greatest number of changes and then let them get older as the story goes on,” Helman shares. “Who is to say that someone 40 or 50 years old moves in a specific way? We all move differently, have back problems or other kinds of generic problems, which is part of who we are; that was reflected in the design of the characters. Also, Marty saw them as damaged goods in the sense that they had a rough life and weren’t really athletic. This wasn’t going to be an action movie. It was basically going to be a movie about a bunch of people sitting in a restaurant planning to kill each other. We had long takes, especially with the delivery of the dialogue. It did make it harder because now you’re concentrating on subtleness. Ambiguity in terms of animation or performance is one of the most difficult things you can portray. It’s a very human thing.”
No keyframe animation was used on the film, as the facial performances were based entirely on 3D models and the library of facial poses captured during the Medusa sessions. “When we started looking at the results of the captures, we realized that they were high fidelity,” Helman remarks. “There was a lot of subtleness in there as well as numerous genetic connections that were made on an individual basis, like how we go from a smile to a concern. A keyframe animator or modeller would not know how those connections are made. In those cases, that’s how you lose the behavioural likeness.”
De Niro was the most challenging to de-age, according to Helman. “He had the most amount of shots and ages. De Niro is also a chameleon in that sometimes he shows up on set and doesn’t look like himself. But the capture is so faithful that we captured something that doesn’t look like him.”
Considering The Irishman takes place over four decades, some digital world building was needed. “There were digital set extensions and augmentation required because though the movie was shot in New York, there are a lot of scenes in Chicago and Detroit as well as New York from the 1960s and 1970s,” Helman states. “There were also long shots connected between two or three takes. Takes that would start on location in the middle of the street and end up inside a set or the other way around. We had about 200 locations. There was also a lot of set work in New York for interiors of the houses and restaurant.”
Previs was created for the murder in the barbershop scenes. “You start with the mobster being shaved and move to the bodyguard,” Helman says. “Then you follow the bodyguard right out of the barbershop into the Roosevelt Hotel. You see two mobsters going in and hear the shots. It’s about three minutes long. I did previs to find out from Marty how he wanted to transition from one scene to the other. He picked the most difficult solution, which was to stay with the bodyguard, so we had to morph between two different scene versions.”
Two years were spent compiling a library consisting of the entire cinematic careers of De Niro, Pesci and Pacino. “We had thousands of frames and created an artificial intelligence program that would take one of our renders and go through the library to pick out frames from all their movies that were like the ones we were rendering,” Helman notes. “It took into consideration lighting, position of the mouth and angles so we could get a sanity check that the likeness would be where it needed to be. That’s the first time we used artificial intelligence in a show to start giving us feedback on the work we were doing.”
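As a rough illustration of the kind of lookup Helman describes, the sketch below ranks reference frames by how closely they match a render. Everything in it is hypothetical: the three features (key-light angle, mouth openness, head yaw), the frame names, and the choice of cosine similarity are assumptions made for the example, not a description of the program ILM actually built.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def closest_frames(render_features, library, top_k=2):
    """Return the names of the top_k library frames most similar to the render."""
    scored = [
        (cosine_similarity(render_features, features), name)
        for name, features in library.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical features per frame: (key-light angle, mouth openness, head yaw),
# each assumed to be normalized to the range 0..1.
library = {
    "film_a_frame_0412": (0.9, 0.1, 0.2),
    "film_b_frame_1033": (0.2, 0.8, 0.7),
    "film_c_frame_0077": (0.6, 0.3, 0.5),
}
render = (0.85, 0.15, 0.25)

matches = closest_frames(render, library)
```

Here `matches` holds the reference frames an artist would pull up next to the render for a likeness check; a production system would need far richer features than three hand-picked numbers.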
One of Helman’s favourite cinematic moments occurs at the film’s beginning. “De Niro has those scenes with Harvey Keitel in the Villa di Roma restaurant when Sheeran gets caught after blowing up the laundromat,” he says. “That’s a good scene because it’s basically about dialogue and seeing Frank Sheeran’s fear he might be killed for doing something he didn’t know anything about. Also, the scenes before that with Sheeran talking to Whispers DiTullio [Paul Herman], who is paying him $10,000 to blow up the laundromat. When they meet in the diner, De Niro doesn’t do much but is always moving his chin. Even if you think that De Niro isn’t doing anything, he is doing a lot. The camera is so close to him that it can see everything that he’s doing.”
“The Irishman is definitely a character piece, not a visual effects show,” Helman concludes. “This was all about preserving the subtle performances.”