Ghostbusters Challenges: Game Loop Parallelization in the Infernal Engine
In the old days of single-processor computers, your game loop ran every process for the game in a single step, one after another, and the results were 100 percent deterministic. Your game loop looked much like the following:
- Run the tick code for every actor
- Perform rigid body simulation
- Process particle effects
- Figure out what is visible
- Render your set
- Render your actors
- Render your particle effects
- Show the frame
With the advent of multiprocessor computers, game programming has become a lot more complicated. Given a 3 GHz quad-core CPU and a fast video card, Ghostbusters: The Video Game will be able to keep all four cores 100 percent utilized in heavy action. We accomplished this feat during the development of the game, which is based on the movie franchise.
When we started on the next-generation systems four years ago, we took a good look at the PS3, Xbox 360 and PC platforms. We used the PS3 “job” model as the basis for our multithreading model for all systems. The PS3 has one general-purpose processor, which we used for our game loop and for jobs that could run on the SPUs. Since the PC and 360 do not have SPUs, we created as many extra job threads as there are CPUs in the system. Each job queue thread (whether running on an SPU on the PS3 or on a CPU on the PC or 360) would sit in a suspended state and be woken up only when there was a job ready to process. Once the job was processed, the thread would check for another job to grab. If another job was ready, the thread would start it; otherwise the thread would go back to sleep.
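The sleeping-worker behavior described above can be sketched in standard C++ for the PC case. This is a minimal illustration only, not the Infernal Engine's code: the names `JobQueue`, `submit`, `waitIdle` and `runDemoJobs` are hypothetical, and a condition variable stands in for whatever suspend/wake mechanism the engine actually used.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical job queue; none of these names come from the Infernal Engine.
class JobQueue {
public:
    explicit JobQueue(unsigned workerCount) {
        for (unsigned i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }

    ~JobQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            shuttingDown_ = true;
        }
        wake_.notify_all();                 // wake every sleeping worker so it can exit
        for (std::thread& w : workers_) w.join();
    }

    void submit(std::function<void()> job) {
        assert(job != nullptr);             // an empty job would throw when called
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
            ++pending_;
        }
        wake_.notify_one();                 // wake one suspended worker
    }

    // Block the caller until every submitted job has finished.
    void waitIdle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Sleep until a job is ready (or the queue is shutting down).
                wake_.wait(lock, [this] { return shuttingDown_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // shutting down and nothing left to run
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();                          // run the job outside the lock
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (--pending_ == 0) idle_.notify_all();
            }
            // Loop back: grab the next job if one is ready, otherwise sleep again.
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;          // signals "a job is ready"
    std::condition_variable idle_;          // signals "all jobs finished"
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    unsigned pending_ = 0;                  // jobs queued or currently running
    bool shuttingDown_ = false;
};

// Usage sketch: one worker per reported CPU, 100 tiny jobs.
int runDemoJobs() {
    std::atomic<int> done{0};
    unsigned cores = std::thread::hardware_concurrency();
    JobQueue queue(cores ? cores : 1);      // hardware_concurrency() may report 0
    for (int i = 0; i < 100; ++i)
        queue.submit([&done] { ++done; });
    queue.waitIdle();
    return done.load();
}
```

Workers spend no CPU time while the queue is empty, and the game loop thread stays free to keep submitting work.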
Our new parallel game loop looks like the following:
- Lock our physics simulation
- Update each actor’s position from physics simulation, queue up animation jobs, run tick code on each actor
- Unlock the physics simulation
- Kick off physics simulation
- Process particle effects
- Queue up visible objects in a display list
- Kick off display list rendering job
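The ordering of the steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the engine's actual loop: `World`, `Body`, `Actor`, `DrawItem`, `simulate`, `render` and `runFrame` are all hypothetical, `std::async` stands in for the job queue, and waiting for the previous simulation step and render job at the points shown is a simplification chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <utility>
#include <vector>

// All types below are hypothetical stand-ins, not Infernal Engine classes.
struct Body     { float x = 0.0f; float vx = 1.0f; };
struct Actor    { float x = 0.0f; std::size_t body = 0; };
struct DrawItem { float x; };

struct World {
    std::mutex physicsMutex;              // guards `bodies`
    std::vector<Body> bodies;
    std::vector<Actor> actors;
    std::future<void> physicsJob;         // in-flight simulation step, if any
    std::future<std::size_t> renderJob;   // in-flight rendering job, if any
};

// Physics job: integrates the bodies while holding the physics lock.
void simulate(World& w, float dt) {
    std::lock_guard<std::mutex> lock(w.physicsMutex);
    for (Body& b : w.bodies) b.x += b.vx * dt;
}

// Rendering job: works only from its own copy of the display list.
// Returns the item count so the sketch has something observable.
std::size_t render(std::vector<DrawItem> displayList) {
    return displayList.size();
}

void runFrame(World& w, float dt) {
    // Assumption of this sketch: finish the previous simulation step first.
    if (w.physicsJob.valid()) w.physicsJob.get();

    {
        // Lock the physics simulation.
        std::lock_guard<std::mutex> lock(w.physicsMutex);
        // Update each actor's position from the simulation. Animation jobs
        // would be queued and tick code run here as well.
        for (Actor& a : w.actors) {
            assert(a.body < w.bodies.size());
            a.x = w.bodies[a.body].x;
        }
    }   // Unlock the physics simulation.

    // Kick off the physics simulation for the next step.
    w.physicsJob = std::async(std::launch::async, simulate, std::ref(w), dt);

    // Particle effects would be processed here (omitted).

    // Queue up visible objects in a display list (every actor, in this sketch).
    std::vector<DrawItem> displayList;
    for (const Actor& a : w.actors) displayList.push_back(DrawItem{a.x});

    // Kick off the rendering job once the previous frame's job is done.
    if (w.renderJob.valid()) w.renderJob.get();
    w.renderJob = std::async(std::launch::async, render, std::move(displayList));
}

// Usage sketch: three frames with one body moving one unit per step.
float runDemoFrames() {
    World w;
    w.bodies.push_back(Body{});
    w.actors.push_back(Actor{});
    for (int i = 0; i < 3; ++i) runFrame(w, 1.0f);
    return w.actors[0].x;   // the actor lags the simulation by one step
}
```

In this sketch the game loop only reads physics state while holding the lock, so the simulation and the rendering job are free to run alongside the rest of the frame.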
When we queue up the display list, it contains the full state of what needs to be rendered, without relying on any game data. This requires copying data into the display list, such as the animation state of an actor or the data for a particle effect. The copy is necessary because actor states need to be able to change while we are rendering the previous frame’s data. If there are multiple rendering passes, the display list data can be reused for those passes rather than being queued again for each one.
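A small sketch of the copy-into-the-display-list idea follows. The types and functions here (`Pose`, `AnimatedActor`, `RenderItem`, `buildDisplayList`, `renderPass`, `demoSnapshot`) are hypothetical and much simpler than real engine data; the point is only that each entry owns its data and holds no pointer back into game state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; not the Infernal Engine's structures.
struct Pose          { float jointAngles[4]; };      // stand-in for animation state
struct AnimatedActor { int meshId; Pose pose; };     // live game data

// A display list entry owns a copy of everything the renderer needs.
struct RenderItem { int meshId; Pose pose; };
using DisplayList = std::vector<RenderItem>;

// Copy the render-relevant state of each visible actor into the list.
DisplayList buildDisplayList(const std::vector<AnimatedActor>& visible) {
    DisplayList list;
    list.reserve(visible.size());
    for (const AnimatedActor& a : visible)
        list.push_back(RenderItem{a.meshId, a.pose});
    return list;
}

// Stand-in for a rendering pass: reads only the display list.
float renderPass(const DisplayList& list) {
    float sum = 0.0f;
    for (const RenderItem& item : list) sum += item.pose.jointAngles[0];
    return sum;
}

float demoSnapshot() {
    std::vector<AnimatedActor> actors{
        AnimatedActor{7, Pose{{0.5f, 0.0f, 0.0f, 0.0f}}}};

    DisplayList list = buildDisplayList(actors);

    // The game moves on to the next frame and changes the actor...
    actors[0].pose.jointAngles[0] = 9.0f;

    // ...but both passes still see the state captured in the display list.
    float firstPass  = renderPass(list);
    float secondPass = renderPass(list);
    assert(firstPass == secondPass);
    return secondPass;
}
```

The cost is the extra copy each frame; the benefit is that the renderer never needs a lock on game data.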
The Infernal Engine also had a distinct advantage for actor simulation: each actor was physically simulated as a rigid body or a constrained system of bodies, so collision and movement happened inside the physics engine. To guarantee order of operations, especially for the AI, we still tick each actor in serial, but most of the actual work now happens as jobs.
Our VELOCITY Physics Engine was also rewritten to be massively parallel and run solely in the job queue. The results of having a massively parallel game engine were stunning. When we finally got rendering and simulation of the game running in parallel in the last weeks of Ghostbusters development, the game became solely render-bound. Jobs were totally asynchronous, and we were able to fully utilize three to four cores.
Mark Randel is the president and chief technology officer of Terminal Reality Inc., developer of Ghostbusters: The Video Game.