
OpenAI releases Point-E, an AI that generates 3D models • businessroundups.org

by Ana Lopez

The next breakthrough to take the AI world by storm may be 3D model generators. This week, OpenAI open sourced Point-E, a machine learning system that creates a 3D object from a text prompt. According to a paper published alongside the code base, Point-E can produce 3D models in one to two minutes on a single Nvidia V100 GPU.

Point-E does not create 3D objects in the traditional sense. Rather, it generates point clouds, or discrete sets of data points in space that represent a 3D shape – hence the cheeky abbreviation. (The “E” in Point-E is short for “efficiency” because it is ostensibly faster than previous approaches to generating 3D objects.) Point clouds are easier to synthesize from a computational standpoint, but they don’t capture the fine-grained shape or texture – currently a major limitation of Point-E.
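In practice, a point cloud is nothing more than an unordered array of coordinates (Point-E’s clouds also carry per-point color). As a rough illustration — not Point-E’s own code — here is how a bare-bones point cloud for a sphere might be represented with NumPy; the sampling function is purely hypothetical:

```python
import numpy as np

def random_sphere_point_cloud(n_points: int, seed: int = 0) -> np.ndarray:
    """Illustrative only: sample n_points uniformly on the unit sphere by
    normalizing Gaussian samples. Each row is one (x, y, z) point."""
    rng = np.random.default_rng(seed)
    points = rng.normal(size=(n_points, 3))
    return points / np.linalg.norm(points, axis=1, keepdims=True)

cloud = random_sphere_point_cloud(1024)
print(cloud.shape)  # (1024, 3)
```

The appeal, computationally, is exactly what the researchers describe: a shape reduces to a flat array of points rather than a watertight surface, which is cheap to generate but discards fine-grained geometry and texture.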

To get around this limitation, the Point-E team trained an additional AI system to convert Point-E’s point clouds into meshes. (Meshes — the collection of vertices, edges, and faces that define an object — are commonly used in 3D modeling and design.) But they note in the article that the model can sometimes miss certain parts of objects, resulting in blocky or distorted shapes.
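Point-E’s mesh converter is a learned model, but the underlying idea — turning loose points into vertices and faces — can be sketched with a classical geometric stand-in. For a convex shape, a triangle mesh can be recovered from a point cloud via its convex hull (SciPy wraps Qhull for this). This is a toy substitute, not OpenAI’s method:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Toy point cloud: the eight corners of a unit cube (a convex shape).
points = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                  dtype=float)

# Qhull returns triangulated output: `simplices` holds one (i, j, k)
# index triple per triangular face -- i.e. vertices plus faces, a mesh.
hull = ConvexHull(points)
faces = hull.simplices

print(len(faces))  # 12 triangles: each of the cube's 6 faces split in two
```

A learned converter like Point-E’s handles non-convex, noisy clouds that a convex hull cannot — which is also where, as the paper notes, it can miss parts of objects and produce blocky or distorted shapes.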

Image Credits: OpenAI

Aside from the mesh-generating model, which stands alone, Point-E consists of two models: a text-to-image model and an image-to-3D model. The text-to-image model, similar to generative art systems such as OpenAI’s own DALL-E 2 and Stable Diffusion, was trained on labeled images to understand the associations between words and visual concepts. The image-to-3D model, on the other hand, was fed a set of images paired with 3D objects so that it learned to translate effectively between the two.

When given a text prompt – say “a 3D printable gear, a single gear 3 inches in diameter and 1/2 inch thick” – Point-E’s text-to-image model generates a synthetically rendered object that is fed to the image-to-3D model, which then generates a point cloud.
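The two-stage flow amounts to simple function composition. The stubs below are hypothetical placeholders (Point-E’s real models are diffusion networks); they only show the shape of the pipeline — prompt in, synthetic image in the middle, colored point cloud out:

```python
import numpy as np

def text_to_image(prompt: str, size: int = 64) -> np.ndarray:
    """Hypothetical stand-in for Point-E's text-to-image model:
    returns an RGB image, here just random pixels seeded by the prompt."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random((size, size, 3))

def image_to_point_cloud(image: np.ndarray, n_points: int = 1024) -> np.ndarray:
    """Hypothetical stand-in for the image-to-3D model: returns n_points
    rows of (x, y, z, r, g, b) -- Point-E's clouds carry per-point color."""
    rng = np.random.default_rng(0)
    xyz = rng.normal(size=(n_points, 3))
    rgb = rng.random((n_points, 3))
    return np.hstack([xyz, rgb])

image = text_to_image("a 3D printable gear, 3 inches in diameter")
cloud = image_to_point_cloud(image)
print(cloud.shape)  # (1024, 6)
```

The design matters for the failure mode described below: because the second stage only ever sees the intermediate image, any misreading of that image propagates into the final shape.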

After training the models on a dataset of “several million” 3D objects and associated metadata, Point-E was able to produce colored point clouds that often corresponded to text prompts, the OpenAI researchers said. It’s not perfect – Point-E’s image-to-3D model sometimes doesn’t understand the image from the text-to-image model, resulting in a shape that doesn’t match the text prompt. Still, it is many times faster than the previous state-of-the-art – at least according to the OpenAI team.

Point-E point clouds converted into meshes.

“While our method underperforms state-of-the-art techniques in this evaluation, it produces samples in a fraction of the time,” they wrote in the paper. “This could make it more practical for certain applications, or could enable the discovery of higher quality 3D objects.”

What are the applications exactly? Well, the OpenAI researchers point out that Point-E’s point clouds can be used to fabricate real-world objects, for example through 3D printing. With the additional mesh-converting model, once it’s a bit more polished, the system could also find its way into game and animation development workflows.

OpenAI may be the latest company to jump into the fray with a 3D object generator, but – as mentioned earlier – it’s certainly not the first. Earlier this year, Google released DreamFusion, an expanded version of Dream Fields, a generative 3D system the company unveiled in 2021. Unlike Dream Fields, DreamFusion requires no prior training, meaning it can generate 3D representations of objects without 3D data.

While all eyes are currently on 2D art generators, model-synthesizing AI could be the next big disruptor in the industry. 3D models are widely used in film and TV, interior design, architecture and various fields of science. For example, architectural firms use them to demonstrate proposed buildings and landscapes, while engineers use models as designs for new devices, vehicles, and structures.

Point-E failure cases.

However, creating a 3D model usually takes a while – anywhere from a few hours to several days. AI like Point-E could change that if the kinks are one day worked out, and earn OpenAI a respectable profit in the process.

The question is what kind of intellectual property disputes may arise in the long term. There is a large market for 3D models, with several online marketplaces, including CGStudio and CreativeMarket, where artists can sell content they’ve created. If Point-E catches on and its models make their way to market, model artists could protest, pointing to evidence that modern generative AI borrows heavily from its training data — existing 3D models, in Point-E’s case. Like DALL-E 2, Point-E does not mention or cite any of the artists who may have influenced its generations.

But OpenAI leaves that issue for another day. Neither the Point-E paper nor the GitHub page mentions copyright.

To their credit, the researchers do mention that they expect Point-E to suffer from other problems, such as biases inherited from the training data and a lack of safeguards around models that could be used to create “dangerous objects.” That may be why they so tentatively characterize Point-E as a “starting point” that they hope will inspire “further work” in text-to-3D synthesis.
