Pollen Music Group’s Alexis Harte, JJ Wiesler and Scot Stafford discuss the evolution of “No Wrong Way Home” and the challenges of composing music and sound design specifically for Virtual Reality.
Patrick Osborne’s short film PEARL, one of the latest installments in the Google Spotlight Stories series, delivers the signature graphic style of the Oscar-winning director of Disney’s FEAST (and Ringling alum) while breaking the boundaries of both storytelling and technology.
Set inside the beloved hatchback that serves as their home, PEARL follows a girl and her father as they crisscross the country chasing their musical dreams. The animated project -- which is available both as a 2D-animated short and as a virtual reality experience -- was unveiled in May at the Google I/O Conference, and is available now for viewing in Google Cardboard (or an equivalent viewer) via the Spotlight Stories channel (for Android users) and the Google Spotlight Stories app (iOS).
While the 2D short has been making the rounds on the festival circuit, the 360-degree interactive version of the film has cropped up in only a few places: the 40th anniversary edition of the Ottawa International Animation Festival, the Tribeca Film Festival’s Interactive Playground, the VR Village at SIGGRAPH, and in a specially installed 360-degree Dome at Animaze.
The mobile storytelling project is a true R&D effort for Google’s ATAP group, employing the largest number of shots, sets and characters in any of the Spotlight Stories to date, with custom lighting, effects and interactive surround sound in every shot. Produced with Evil Eye Pictures, the short is anchored by the original song “No Wrong Way Home,” written by Pollen Music Group principals and composers Alexis Harte & JJ Wiesler and performed by Kelley Stoltz and Nicki Bluhm.
AWN had a chance to try first-hand the VR version of PEARL at SIGGRAPH and, despite being tethered to a Vive headset, we were amazed by the level of attention given to each detail of the truly immersive experience. More recently, we sat down to speak with “No Wrong Way Home” songwriters Harte and Wiesler, along with Pollen principal and composer Scot Stafford, about the challenges of creating music and designing sound for VR, the director’s vision for all five versions of the project, and the back-and-forth of composing a song with five musical transitions overlaid on 38 separate scenes. You can read the (lightly edited) conversation below:
AWN: We’re huge fans of this project and have really enjoyed it in all its forms. For our readers, tell us about the team’s approach to creating the music and designing the sound for this very new kind of storytelling.
Scot Stafford: In our first meetings with Patrick Osborne, the director, we knew that it was going to take place in a car, and we knew it was going to be structured around a song, and we knew that at some point in the film characters might be singing the song -- actually performing it on screen -- or we might be hearing it like a narrative underscore to the film. So, right away we knew that it was going to be a challenge because, first of all, that’s hard to do in VR, but it also meant that we had to come up with a song before the characters were actually animated, because in many cases they were singing along to it.
What is normally a post-production process became a pre-production one: developing the song, which started with a pretty wide search. We looked at a lot of different songwriters from all over, and a lot of artists -- bands and musicians -- submitted sketches inspired by very early animatics. Then we had these blind listening sessions where Patrick came in and just listened to everything along with the producer, David Eisenmann, in our studio in Noe Valley in San Francisco. It gave Patrick an opportunity to hear all these different songs against pictures. You don’t really know what’s going to work until you see and hear at the same time. That’s how we settled on “No Wrong Way Home.”
“In PEARL, in particular, I’d never worked on something where sound design and music held hands so tightly throughout the whole show.” -- Scot Stafford
Then there was a lot of back-and-forth.... We had an idea of the arc of the story, but we didn’t yet understand what the scenes were doing. We knew that the lyrics had to work throughout the arc of the story, and we knew that some of it would be sung by the dad and some by the daughter. Alexis wrote what we called the “Hallelujah” version of the song -- named after the Leonard Cohen song, with its dozens and dozens of verses, only a few of which are performed at any given recording or performance.
In “No Wrong Way Home” there are 12 verses with lyrics from the perspective of the dad, some from the daughter, and some with more ambiguity where either could be singing. The meaning kind of changes, depending on what you’re seeing and who’s singing it. Then Patrick had an opportunity to create a story graph that basically gave us the arc of the whole story and what’s happening in each scene, and that allowed us to actually fully produce the song before we really saw any picture.
In PEARL, in particular, I’d never worked on something where sound design and music held hands so tightly throughout the whole show. Like I said, sometimes you’re just hearing a beautiful song, and sometimes you’re seeing it performed. The sound design has to spatialize around the viewer, and help tell the story so that at any given point you know that the person behind you is singing, you know that the person in front of you is playing guitar, you know that the song is coming out of those stereo speakers in the car right below you.
A lot of work went into just making it work and that was a partnership, a collaboration with our lead sound designer, Jamey Scott, who did just phenomenal sound design. We worked together on the mix to really make it sound like you were immersed in that world, because even if you can only see 20 percent of it at any given time, you can always hear 100 percent of it. It was a great creative challenge to create this mix of music and sound design that told the story that you couldn’t see.
AWN: Going back to what you said about how you’d never worked on something before where sound design and music held hands so tightly, did you have to develop the approach to sound design at the same time that you were composing the song?
JJ Wiesler: In the process of solving that problem -- how intimately sound design, song and lyrics work together in VR -- we relied on pretty much every spatialization tool out there. There’s object-oriented audio, which is similar to how games are done: audio is attached to an animated figure, and as that figure moves through the space the audio moves with it. We also produced a full song in the studio, just like anybody would make a record, and we relied on that in scenes where we really wanted the music to sit on top of the scene, act like the score, and juxtapose against the scene. And we relied on ambisonic audio, which is a spherical audio format that reacts to head tracking.
There are 38 scenes in the short, and for any given one we would deploy one or all of those techniques to create the soundscape. In some cases they’re all going at once: objects, ambisonic sound design, and score. As you move your head the score stays still and the sound design moves with the scene. Sometimes -- and this was one of Patrick’s brilliant ideas -- as the film evolves we rely much more on score and let the diegetic, scene-based audio play a lesser role. It’s more realistic at the beginning and more fantastic and cinematic towards the end. That transition, I think, creates a lot of emotion.
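The layering JJ describes -- a head-locked score plus an ambisonic bed that counter-rotates against head tracking -- can be sketched in a few lines. This is a hypothetical illustration, not Pollen's or Google's actual pipeline; it assumes first-order B-format channels with +X forward and +Y left.

```python
import numpy as np

def rotate_ambisonics_yaw(wxyz, head_yaw):
    """Counter-rotate a first-order ambisonic (B-format) frame so the
    soundfield stays world-locked as the listener turns their head.
    Convention assumed: +X forward, +Y left; W (omni) and Z (height)
    are unaffected by yaw rotation."""
    w, x, y, z = wxyz
    theta = -head_yaw                      # rotate the field opposite the head turn
    c, s = np.cos(theta), np.sin(theta)
    return np.array([w, c * x - s * y, s * x + c * y, z])

# A source straight ahead (X = 1); the listener turns 90 degrees left,
# so the rotated field places the source on their right (negative Y).
frame = np.array([1.0, 1.0, 0.0, 0.0])
rotated = rotate_ambisonics_yaw(frame, np.pi / 2)
```

The head-locked score layer would simply skip this rotation, which is why, as JJ says, the score stays still while the sound design moves with the scene.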
AWN: When you say realistic at the beginning, do you mean realistic in terms of a song structure and how a song is normally presented? Or do you mean the opposite?
JJ: When I say realistic, I mean representing an immersive spatial world of sound, where there’s birds over here and there’s somebody talking over there...the sound design puts you in the scene. As it evolves it’s much more like you’re in a song, and you’re taken out of the scene and you’re allowed to sort of watch the scene rather than be in the scene.
Alexis Harte: For me, there are two reasons why the audio works so well: one is all of the technical, wonderful stuff that happens, and then the other thing is just having a song that works. Luckily, I get to focus my efforts on the part of this that I do well; I’m a songwriter, and I didn’t have to worry about any of the spatialization stuff. JJ and I built the song together; we started off like we usually do as a songwriting team: “Let’s write a great song and let’s make the song work with the picture, let’s make it work with the story; let’s give Patrick enough substance in the verses that he can pick and choose.”
There were clearly some verses that we wanted to be from the point of view of the father that would encompass the themes of parenthood and responsibility. Then there were clearly verses that were Sarah’s verses, you know, growing up in this way with all the anxiety, insecurities, but also fun and a sense of adventure. Then we also wanted some verses that could be a little bit more ambiguous, that could potentially be told from either point of view. The meaning of the verse could change depending on who was singing it. So there were three categories of verses that we presented to Patrick, which allowed him to decide, as he was telling the story, which perspective he was going to use based on the verses. He ended up using five of them in total.
AWN: Can you talk a little bit about some of the musical inspirations for the song? For instance, I’m reminded of Cat Stevens and some of the songs that he wrote that employed various voices.
AH: “Father and Son” is a classic, but it’s actually really clear, on at least one level, who’s singing what, and why they’re singing it. There’s also Jim Croce...Bob Dylan.... “Forever Young” was that sort of classic song. There are just so many great examples, and we talked about all of them.
The song “No Wrong Way Home” has a ballad form with a repeating refrain. Then the verses set the story up in different ways, so that when you land on that final verse, it hits hard. There’s definitely a songwriting mold that we followed; we didn’t reinvent the wheel here at all.
“One thing that is so great about this project is that it’s a simple, classic story with a simple, classic song. It uses a crazy amount of new technology, but you don’t need the technology to understand the song, or the story.” -- JJ Wiesler
SS: The story is obviously about a father and a daughter, but what happens with it is so moving. I love the way that Patrick starts with what really seems like a very ideal, fun way to grow up; you have this really cool-looking dad... Of course people were disturbed that no-one was wearing seatbelts but, hey, it was the 80’s, come on! The idea is that you see this adorable girl and this really hip dad just having this adventure together. Then the story progresses, and you start to understand their situation; all their belongings are in the car and you realize that they’re living in the car, and dad finally comes to this realization that it’s time to give up on his dream, and get a job and support his daughter. In her arc, in Sarah’s story, there’s this amazing pathos between the father and the daughter; she gets the car, she gets the guitar, and she gets the song and makes it her own. Really, that’s where this all came from.
Through the entire process I was very aware -- even while facing all of the new and really exciting technology, the new format, and new ways of experiencing stories -- that we couldn’t forget about just making it a great song that people would emotionally relate to. That’s why I really tried to let Alexis do his thing while I worried about the whole process, trying to look into the future and ask, “What’s this going to be like? What will I wish we had done differently?” Then time-travel back to the present and try to do that.
JJ: There’s a timelessness to the story, and Patrick communicated that to us early on really clearly. That there’s going to be stuff for everybody to connect to in this. It’s just so wide. Just because the medium is futuristic, doesn’t mean the story can’t be classic. One thing that is so great about this project is that it’s a simple, classic story with a simple, classic song. It uses a crazy amount of new technology, but you don’t need the technology to understand the song, or the story.
That was a really wise decision by [Google Spotlight Stories creative director] Jan [Pinkava] and by Patrick and Scot and all the other high-level planners. That sort of mantra of, “Well, what’s the story?” You know, am I crying, am I moved? You can make the bird chirp over there on the left, but why is it there, and does it drive the story? Those sorts of big-thinking minds really played a huge role in the efficacy of this little film in the end.
AH: It was a very conscious decision to try to make this as compelling as possible, even without the tech. And I think that’s borne out in the fact that the 2D version is taking on a life of its own, screening at festivals and the like.
JJ: It comes right back to your question about Cat Stevens; from a songwriting point of view, and from a production point of view, we were definitely looking into that era. We asked ourselves, “What are some of the things the dad would have been listening to? What would he have grown up on?” Cat Stevens, for sure, Dylan... Just the timeless, lyric-heavy Bob Dylan, you know? Which we love; it’s definitely the era I grew up in, as well.
AWN: Somebody mentioned seat belts? Viewers really squawked about the lack of seat belts?
SS: You know, that’s the fun thing about this stuff: you put it out there on platforms like YouTube, you get this kind of instant reaction from individuals, and then you see those comments being voted up. It was amazing -- some of the earliest upvoted comments on YouTube were like, “I can’t believe it! What a terrible father!” These are obviously people who didn’t grow up in the 70s and 80s and earlier, when we were just running around in the backs and the fronts of cars. That was one of the surprising, head-scratching comments we got early on.
AWN: Eek, the safety police -- Patrick just can’t win. With Winston, in FEAST, he got a lot of flak because Winston ate too much! As someone who grew up in a Ford Econoline, I was very touched by the story as a film, and I wondered how people who grew up maybe a little more conventionally would relate to it; but it seems to resonate on a number of levels, because everyone goes through that universal experience of handing off adulthood to the next generation.
SS: That’s right, everyone has been in the back of the car, and as you get older you get in the front of the car. If you have kids, there’s that experience of constantly checking on them in the back of the car. I’ve experienced it so many times where you look, you check your kids in the back of the car, and then a moment later you blink and you look back and they’re grown up.
“It was a very conscious decision to try to make this as compelling as possible, even without the tech. And I think that’s borne out in the fact that the 2D version is taking on a life of its own, screening at festivals and the like.” -- Alexis Harte
AWN: And they’re driving.
SS: It’s so evocative and universal, especially in countries where the road trip is part of the culture; it’s just something that everyone can relate to. It was also great to be in a car for so many different reasons. One is that in VR you really don’t want to force people to move around. Being in a car also allowed Patrick to compose shots that you’re not supposed to be able to do in VR, because you don’t control the camera; your viewers do. You use the windows and the framing of the car to move people through a really interesting passage of time while still having a sense of choreography and shot composition.
AWN: So each shot is actually multiple shots?
SS: That’s right, and that’s another thing that you’re not supposed to do in VR. Patrick came in and said, “Well, I’m gonna do it.” We do it 38 times, and the reason it worked, I think, is twofold. One, you’re in a car, and that gives you a frame; even if the world is shifting around you and there are cuts, you never actually lose the position and orientation of being in that front seat. The car gives you that stability and that framework.
Another thing is the sound and music. One of the jobs of sound and music in film is to blur the lines that are created by cuts.
It was something that actually wasn’t working at all until the very last minute. We were having trouble with audio-visual sync, so in between each of these shots you would have a blip about a second long where the next scene would load. We just had to trust that it was going to work. It was really unnerving, because it seemed to prove that you actually can’t do cuts. All the way up until about two weeks before Tribeca, we literally had never watched it and had it work. Then, when the sound finally came in and the audio-sync feature was introduced and you could finally smooth over the cuts, it was like, “Oh my God!” This yearlong gamble actually looked like it was going to pay off.
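Scot doesn't describe the implementation, but the classic way sound and music blur a cut is an equal-power crossfade. Here is a minimal, hypothetical sketch of the idea, with NumPy arrays standing in for audio buffers:

```python
import numpy as np

def equal_power_crossfade(a, b, fade_len):
    """Blend the tail of clip `a` into the head of clip `b`.
    The cos/sin fade curves keep total power roughly constant across
    the join, so the ear hears a continuous bed of sound rather than
    a dip or blip at the cut."""
    t = np.linspace(0.0, 1.0, fade_len)
    mixed = a[-fade_len:] * np.cos(t * np.pi / 2) + b[:fade_len] * np.sin(t * np.pi / 2)
    return np.concatenate([a[:-fade_len], mixed, b[fade_len:]])

# Fade a clip of silence in over the tail of a steady tone.
out = equal_power_crossfade(np.ones(100), np.zeros(100), fade_len=10)
```

A linear crossfade would dip in loudness mid-fade (0.5 + 0.5 amplitude is only about -6 dB of power); the equal-power curve avoids that dip, which is exactly what you want when hiding a scene transition under continuous music.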
AWN: Okay, there’s FEAST again. Because right up to the very end they didn’t actually know if [lighting & rendering tool] Hyperion was going to work. Very edge-of-your-seat. It must have been very gratifying when it all did finally come together.
AH: As we began writing parts for specific singers and creating the arrangements, we also had a lot of discussions about how to show the song and its progression through the generations as it’s handed down to Sarah, who takes it, makes it into a hit, and gets it on the radio. We had to take a folk song and turn it into something that would be on a Triple A station. We spent a lot of time talking about keys, and how the song actually changes keys when she picks up the guitar.
We wanted it to feel like we were starting a new chapter with the song, so we focused on a more radio-produced sound. We had to make the song work with the cuts, so there were audio cuts as well. The audio cuts were used to reset the timer on the song, in a way, but we didn’t want them to be too jarring. We had to choose our key changes and tempo changes wisely. There was a lot of discussion about where the tempo would go, so that the song would keep propelling the action along. It had to work with the cuts, so we changed tempos quite a number of times -- in a fairly subtle way, but that was just another piece that made it all come together at the end.
-- READ: Patrick Osborne Lends His Talents to New VR Experience ‘PEARL’ only on AWN! --
AWN: Were those audio cuts synced up with the visual scene cuts? Or are these entirely separate elements?
AH: No, they had to be synced up with the visual sequences. It was kind of an iterative process -- we went back and forth with the animation team on selecting the points at which the song would change. We had five song sections that were built, in addition to the 38 scene cuts, so the song transitions had to be laid over the scene cuts. That was a big jigsaw puzzle.
JJ: The 38 cuts refers to the visual scenes, and the five audio cuts refers to musical segments of the song where we could naturally come to a conclusion and restart something. From an interactive point of view, that matters because some waiting can happen if the user isn’t looking in the right place; we can kind of wait for them to catch their breath. It just works really well from a song point of view: maybe the dad finishes a verse, you wait a second, and then he starts up the next verse. At certain points in the song itself, once you get going you can’t stop; audio is very unforgiving of blips.
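The structure JJ describes -- a song that can pause only at its five section boundaries, never mid-phrase -- could be modeled as a simple state machine. This sketch is purely illustrative (the section names and API are invented, not from the actual Spotlight Stories engine), but it captures the rule: at a boundary marked as a wait point, vamp until the viewer is looking at the story point, then advance.

```python
class InteractiveSong:
    """Hypothetical sketch: sections play in order; at a boundary marked
    as a wait point, the music loops a seamless vamp until the viewer
    looks at the current story point, then moves on."""

    def __init__(self, sections):
        self.sections = sections  # list of (name, is_wait_point)
        self.index = 0            # which section is currently playing

    def on_section_end(self, viewer_at_story_point):
        name, is_wait_point = self.sections[self.index]
        if is_wait_point and not viewer_at_story_point:
            return "vamp"         # hold here; replay a loopable bar
        self.index += 1           # the cut is musically safe; advance
        return "advance"

song = InteractiveSong([("verse-1", True), ("chorus", False)])
```

The key design point is that the decision is only evaluated at section ends, which is why the experience can run anywhere from five to seven and a half minutes without the music ever being interrupted mid-phrase.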
AWN: How does this process differ from creating elements of a score for a feature project?
SS: What was so interesting about this project was that we actually got to do all of the above. There are really five versions of it: the interactive mobile 360 version, the non-interactive YouTube 360 version, a regular VR version for Google Cardboard, the so-called “big” VR version for Vive, and the theatrical version.
We actually approached each version differently, and in many cases it wasn’t just reassembling, re-implementing, or remixing the sound for each; we actually did recordings that were only meant for one and not the other. We did ambisonic recordings of each element. For every scene where you see the dad with his back to the camera outside the car, busking in the streets, we literally set up that scene and had Kelley Stoltz -- the amazing singer who’s also a brilliant songwriter in his own right -- actually do that. We did recordings of him in the car with his back to us, facing us, doors open, doors closed, one window rolled down... For every scene we had to literally reconstruct it in 3D.
AWN: You iterated each section, according to how it would be affected by the physical environment?
SS: Exactly, and that played a big role in the VR version. Once we got to the dub stage in Burbank, where we properly mixed the theatrical version, we didn’t use any of those recordings. In a theater you actually don’t want to feel like you’re in a car, so we relied primarily on the studio recordings. It’s actually very rare that you can spend that much time on each version and make each what it’s really supposed to be. I guess I would say that the difference is that in VR, whether you want to or not, people are immersed and embodied in the world. Whereas in a theater you have a choice, and in most cases people aren’t literally part of the story.
In VR you have to think about what happens as the characters go off to the left, off the screen; in film they’re gone. In VR you can follow them, so there’s no preconception, no sense that “Oh, they’ve gone off screen left.” They don’t leave the screen if you follow them; they continue, so you literally have to create a sonic soundscape that follows wherever people want to go. Additionally, you have to allow for some interactivity. There are moments where the story will wait for you, whereas everything made for theater or broadcast television -- everything in 2D -- has a fixed duration. You know, it’s one hour and 49 minutes, or, if it’s a short, it’s five minutes. This could be anywhere from five to seven and a half minutes, depending on how you watch the show.
You see, you have to have music that’s there for you, covering these moments where the story is waiting for you, so that when you look at a key story point it continues. It has to be seamless. You never want to reveal the machinery beneath; you want to make it seem like, “Oh, I looked over there and the story started back up.” Music and sound need to play a lot of additional roles, allowing people to feel like they’re actually there, embodied, and can do what they want.