Capping its GTC announcements, the tech giant unveils a new model that, using only 2D images, generates 3D shapes with high-fidelity textures and complex geometric details, making it easy to populate virtual worlds with a diverse array of 3D buildings, vehicles, and characters in the most popular software formats.
After days of whirlwind announcements and reveals about the future of the Omniverse, including expanded tools, releases, updates, and new technologies likely to shape the metaverse, NVIDIA capped off GTC, its multiday AI conference, with the reveal of a new AI model from NVIDIA Research. The model illustrates how the massive virtual worlds created by growing numbers of companies and creators can be more easily populated with a diverse array of 3D buildings, vehicles, and characters.
Trained using only 2D images, NVIDIA GET3D generates 3D shapes with high-fidelity textures and complex geometric details. These 3D objects are created in the same format used by popular graphics software applications, allowing users to immediately import their shapes into 3D renderers and game engines for further editing.
The generated objects can be used in 3D representations of buildings, outdoor spaces, or entire cities, designed for the gaming, robotics, architecture, and social media industries. GET3D can generate a virtually unlimited number of 3D shapes based on the data it’s trained on. Like an artist who turns a lump of clay into a detailed sculpture, the model transforms numbers into complex 3D shapes.
For example, using a training dataset of 2D car images, GET3D creates a collection of sedans, trucks, race cars, and vans. When trained on animal images, it comes up with creatures such as foxes, rhinos, horses, and bears. Given chairs, the model generates assorted swivel chairs, dining chairs, and cozy recliners.
Watch “NVIDIA GET3D: AI Model to Populate Virtual Worlds with 3D Objects and Characters”:
“GET3D brings us a step closer to democratizing AI-powered 3D content creation,” said Sanja Fidler, vice president of AI research at NVIDIA, who leads the Toronto-based AI lab that created the tool. “Its ability to instantly generate textured 3D shapes could be a game-changer for developers, helping them rapidly populate virtual worlds with varied and interesting objects.”
GET3D is one of more than 20 NVIDIA-authored papers and workshops accepted to the NeurIPS AI conference, taking place in New Orleans and virtually, November 26-December 4.
Manually modeling a 3D virtual world that reflects real-world variety, such as streets lined with unique buildings, different vehicles whizzing by, and diverse crowds passing through, is incredibly time-consuming, making it difficult to fill out a detailed digital environment.
As explained by NVIDIA Research, though quicker than manual methods, prior 3D generative AI models were limited in the level of detail they could produce. Even recent inverse rendering methods can only generate 3D objects based on 2D images taken from various angles, requiring developers to build one 3D shape at a time.
GET3D can instead churn out some 20 shapes a second when running inference on a single NVIDIA GPU — working like a generative adversarial network for 2D images while generating 3D objects. The larger, more diverse the training dataset it’s learned from, the more varied and detailed the output.
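The article doesn’t detail GET3D’s internals, but the batched, GAN-style sampling it describes — one random latent code in, one 3D shape out — can be illustrated with a toy sketch. Everything below (the random “generator” matrix, the spherical template, the dimensions) is a hypothetical stand-in, not GET3D’s actual architecture:

```python
import numpy as np

def sample_shapes(n_shapes, latent_dim=64, n_vertices=128, seed=0):
    """Toy GAN-style sampler: map each random latent vector through a
    fixed random 'generator' matrix to per-vertex offsets that deform
    a base point set into a new shape. Illustrative only."""
    rng = np.random.default_rng(seed)
    # Base shape: points on a unit sphere, standing in for a template mesh.
    base = rng.normal(size=(n_vertices, 3))
    base /= np.linalg.norm(base, axis=1, keepdims=True)
    # Stand-in generator weights; a trained model would learn these.
    weights = rng.normal(size=(latent_dim, n_vertices * 3)) * 0.05
    shapes = []
    for _ in range(n_shapes):
        z = rng.normal(size=latent_dim)            # one latent code per shape
        offsets = (z @ weights).reshape(n_vertices, 3)
        shapes.append(base + offsets)              # deformed copy of the base
    return shapes

# Drawing a batch of 20 mirrors the "~20 shapes a second" inference claim
# only in spirit: each shape costs a single forward pass through a generator.
batch = sample_shapes(20)
```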
NVIDIA researchers trained GET3D on synthetic data consisting of 2D images of 3D shapes captured from different camera angles. It took the team two days to train the model on around 1 million images using NVIDIA A100 Tensor Core GPUs.
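Capturing a 3D shape “from different camera angles” to build such a 2D training set amounts to sampling viewpoints on a sphere around the object. A minimal sketch of that sampling, assuming a simple look-at camera model (the function names and ranges here are illustrative, not NVIDIA’s pipeline):

```python
import numpy as np

def look_at(eye, target=np.zeros(3), up=np.array([0.0, 0.0, 1.0])):
    """World-to-camera rotation for a camera at `eye` looking at `target`.
    Rows of the result are the camera's right, up, and back axes."""
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    return np.stack([right, true_up, -forward])

def random_camera(radius=2.0, rng=None):
    """Sample a viewpoint on a sphere around the object: random azimuth
    and a bounded elevation, as one might when rendering a 3D asset
    from many angles to produce 2D training images."""
    if rng is None:
        rng = np.random.default_rng()
    azim = rng.uniform(0, 2 * np.pi)
    elev = rng.uniform(-np.pi / 4, np.pi / 4)  # avoid degenerate poles
    eye = radius * np.array([np.cos(elev) * np.cos(azim),
                             np.cos(elev) * np.sin(azim),
                             np.sin(elev)])
    return eye, look_at(eye)

eye, R = random_camera(rng=np.random.default_rng(0))
```

Each sampled pose would then drive a renderer to produce one 2D view of the shape.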
GET3D gets its name from its ability to Generate Explicit Textured 3D meshes — meaning that the shapes it creates are in the form of a triangle mesh, like a papier-mâché model, covered with a textured material. Users can then easily import the objects into game engines, 3D modelers, and film renderers — and edit them.
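What makes an explicit triangle mesh so portable is that it is just vertices plus triangles, which standard interchange formats encode directly. As a small sketch (the helper below is hypothetical; Wavefront OBJ is only one of the formats such tools accept), here is a mesh serialized to OBJ text:

```python
def mesh_to_obj(vertices, faces):
    """Serialize a triangle mesh to Wavefront OBJ, a plain-text format
    widely importable by game engines, 3D modelers, and film renderers.
    `vertices` holds (x, y, z) tuples; `faces` holds 0-based vertex indices."""
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    # OBJ face indices are 1-based, so shift each index up by one.
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in faces]
    return "\n".join(lines) + "\n"

# A tetrahedron: four vertices, four triangular faces.
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
obj_text = mesh_to_obj(verts, faces)
```

A generated mesh written this way can be opened and edited immediately in common 3D tools.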
Once creators export GET3D-generated shapes to a graphics application, they can apply realistic lighting effects as the object moves or rotates in a scene. By incorporating another AI tool from NVIDIA Research, StyleGAN-NADA, developers can use text prompts to add a specific style to an image, such as modifying a rendered car to become a burned car or a taxi, or turning a regular house into a haunted one.
The researchers note that a future version of GET3D could use camera pose estimation techniques to allow developers to train the model on real-world data instead of synthetic datasets. It could also be improved to support universal generation — meaning developers could train GET3D on all kinds of 3D shapes at once, rather than needing to train it on one object category at a time.
To catch up on the latest news from NVIDIA AI research, watch the replay of NVIDIA founder and CEO Jensen Huang’s keynote address at GTC: