Coinciding with SIGGRAPH, NVIDIA has launched its newest GPUs (the Quadro 4000, 5000 and 6000), based on its latest Fermi architecture, offering the long-sought-after "computational visualization."
"This is the holy grail," exclaimed Dan Vivoli, SVP of NVIDIA, at a recent Fermi press briefing at the company's headquarters in Santa Clara, CA. "This is the culmination of a 10-year plan."
Indeed, Fermi offers 5x the design complexity of the previous architecture and up to 8x the simulation performance. It addresses not only computational difficulty but also graphical complexity, resulting in more detailed modeling and faster performance.
The 4000 features 2 GB of frame buffer memory and 256 CUDA parallel processing cores; the 5000 has 2.5 GB and 352 cores; and the 6000 has 6 GB and 448 cores.
"The speed up is truly transformative for our customers," added Jeff Brown, general manager, Professional Solutions Group, NVIDIA, "giving them interactive insight and dramatically enhancing their creative process in ways that have not been possible on individual workstations before."
Meanwhile, the Quadro 6000 delivers 1.3 billion triangles per second, shattering previous 3D performance benchmarks, with gains of up to 8x when running computationally intensive applications such as ray tracing, video processing and computational fluid dynamics. In addition, all three graphics cards enable advanced capabilities for stereoscopic 3D, scalable visualization and high-definition 3D broadcasting.
For starters, imagine what a difference Fermi will make with Plume, Industrial Light & Magic's new fluid simulation system and GPU-based renderer built around CUDA, introduced on The Last Airbender. Plume used a 12-machine GPU-based render farm powered by Quadro FX 5800 graphics cards.
"We fully expect the Quadro 6000 card to increase the performance of tools like Plume by several orders of magnitude," suggested Dominick Spina, NVIDIA Sr. Product Manager, Vertical Marketing Group. "Not only has the new Fermi architecture been designed to address problems that are both computationally difficult and graphically intensive, but the increased frame buffer size will allow for the handling of massive geometry sets."
ILM opted to write its GPU-accelerated fluid solver using NVIDIA CUDA rather than OpenGL because it simplified the development process.
"Every new generation Quadro is designed to leverage the latest features of CUDA," Spina continues. "We are now working with studios like ILM to integrate their production data sets into our quality performance test suites. This will now provide a more reliable method of regression testing and reduce the risk of making pipeline changes during productions.
"The Fermi's tight coupling of computation and visualization processes makes achieving final renders on the GPU in one step a reality. No longer does the design need to incorporate a pre-computational step and an additional render process; CUDA can provide the framework to compute and render on the GPU within a single process, reducing latency. This opens possibilities for implementing similar techniques in other physical simulations such as RBD and water."
In fact, ILM plans to incorporate additional NVIDIA CUDA-based tools into future project pipelines and continues to explore new ways to implement Quadro GPU-accelerated rendering in its VFX workflows.
Meanwhile, Mari, the 3D texture painting package developed at Weta Digital for Avatar, is GPU-based and was developed with Quadro hardware, providing up to 700 4K textures. With quick turnaround, you can paint creatures in realtime, in motion, with no need for a render farm. Mari was also used on The Lovely Bones.
As Senior R&D Engineer at Weta, Jack Greasley led the team that developed Mari. He has taken Mari to The Foundry, where he serves as product manager for the system. "Running Mari on Fermi Quadros immediately gave us a dramatic speed boost," boasts Greasley. "Having access to more texture units, more GPU RAM and larger texture sizes will allow us to better support our users' creativity. Freeing people from the constraints of hardware means they can concentrate on their art and not arbitrary technical limitations."
Internally, The Foundry developed Blink, a multi-device image processing framework that allows the same software to run on CPUs and CUDA-based GPUs. The Foundry decided to tackle one of its hardest algorithms first: the motion estimation-based retimer, Kronos. On the Quadro 5000, The Foundry can compute a 10:1 slowdown on SD footage at a peak rate of about 200 frames per second.
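To make the 10:1 figure concrete: a slowdown of that ratio has to synthesize nine in-between frames for every pair of source frames. The short Python sketch below shows only that time mapping. It is not Kronos or Blink code; the function name `retime_positions` and its return format are hypothetical, chosen for illustration. A motion estimation-based retimer would then warp pixels along estimated motion vectors at each fractional position, a step this sketch leaves out.

```python
def retime_positions(num_source_frames, factor):
    """Map each output frame of a factor:1 slowdown to a source position.

    Returns a list of (left_index, t) pairs. Each output frame lies a
    fraction t (0 <= t < 1) of the way from source frame left_index to
    source frame left_index + 1. Hypothetical helper, for illustration only.
    """
    if num_source_frames < 1 or factor < 1:
        raise ValueError("need at least one source frame and factor >= 1")

    # The last source frame is also the last output frame; nothing is
    # synthesized beyond it.
    num_output = (num_source_frames - 1) * factor + 1

    positions = []
    for i in range(num_output):
        # Integer division keeps the frame index exact; the remainder
        # gives the fractional offset toward the next source frame.
        left, rem = divmod(i, factor)
        positions.append((left, rem / factor))
    return positions


if __name__ == "__main__":
    # Three source frames slowed down 10:1.
    for out_index, (left, t) in enumerate(retime_positions(3, 10)):
        print(out_index, left, t)
```

Under these assumptions, three source frames at 10:1 yield 21 output frames, of which 18 fall between source frames and would need to be synthesized.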
Ray tracing, too, will be much faster and more efficient thanks to the Fermi architecture, which enables software developers to create high-performance solutions in their choice of computing languages. NVIDIA also helps developers quickly take advantage of the GPU by providing the NVIDIA OptiX ray tracing engine for accelerating custom solutions, and iray from mental images for a complete, world-class renderer.
"Our photorealistic iray rendering solution demonstrates the massive speedup delivered by NVIDIA's new Fermi GPU architecture," said Rolf Herken, CEO & CTO of mental images. "iray delivers dramatically higher performance on a single Quadro GPU than on a quad-core CPU. In addition, iray can leverage multiple GPUs in a single workstation or across a cluster of machines for ultimate performance scalability."
Bill Desowitz is senior editor of AWN & VFXWorld.