Nvidia · Filed Feb 16, 2026 · Published Jun 25, 2026 · verified — real USPTO data

Nvidia Patents Software That Builds 3D Shapes From Flat Photos

Nvidia is patenting a system that lets a neural network look at ordinary flat images and reconstruct the three-dimensional shape of objects in them, a capability that sits at the core of robotics, autonomous vehicles, and spatial computing.

Nvidia Patent: Neural Network 2D-to-3D Point Cloud Generation — figure from US 2026/0179247 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number: US 2026/0179247 A1
Applicant: NVIDIA Corporation
Filing date: Feb 16, 2026
Publication date: Jun 25, 2026
Inventors: Maria Shugrina, Luca Moschella, Sanja Fidler
CPC classification: 382/155
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Mar 22, 2026)
Parent application: Continuation of 18115582 (filed Feb 28, 2023)
Document: 20 claims

How Nvidia turns regular photos into 3D point clouds

Imagine trying to describe the exact shape of a coffee mug to a friend using only photographs. You'd need pictures from multiple angles, and even then the friend would be guessing at depth. Computers face the same problem, and it gets expensive fast when you need 3D data at scale.

Nvidia's patent application describes a system that uses a neural network (a type of AI trained on examples) to turn one or more regular 2D images into a 3D point cloud. A point cloud is a cluster of dots in space that together define the shape of an object, like a connect-the-dots model hovering in midair.

The clever part is that the system feeds both the 2D images and an existing point cloud back into the network at the same time, letting the AI refine its guesses using what it already knows about the shape. The result is a processor-level pipeline that could help machines understand their physical environment without needing expensive dedicated 3D sensors for everything.

How the neural network loops 2D images into 3D output

The patent describes a processor with dedicated circuits that run one or more neural networks to generate 3D point clouds of objects. A point cloud is a set of (x, y, z) coordinate positions that together represent the surface or volume of a real-world object. Think of it as a digital sculpture made of thousands of floating dots.
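
To make the data structure concrete, here is a minimal Python sketch of a point cloud as a plain list of (x, y, z) tuples. The names `Point`, `centroid`, and `bounding_box` are illustrative assumptions and do not come from the patent; real pipelines would more likely store the points in an array library.

```python
from typing import List, Tuple

# Hypothetical type alias: one point is an (x, y, z) coordinate triple.
Point = Tuple[float, float, float]


def centroid(cloud: List[Point]) -> Point:
    """Average position of all points in the cloud."""
    n = len(cloud)
    return (
        sum(p[0] for p in cloud) / n,
        sum(p[1] for p in cloud) / n,
        sum(p[2] for p in cloud) / n,
    )


def bounding_box(cloud: List[Point]) -> Tuple[Point, Point]:
    """Smallest axis-aligned box that contains every point."""
    xs, ys, zs = zip(*cloud)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# A tiny cloud: the eight corners of a unit cube.
cube: List[Point] = [
    (x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)
]

print(len(cube))           # 8
print(centroid(cube))      # (0.5, 0.5, 0.5)
print(bounding_box(cube))  # ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

A real scan would hold thousands or millions of such triples, but the structure is the same: an unordered collection of positions with no explicit surface connecting them.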

The key input pairing is what makes this approach distinct. The network takes in 2D images (standard photographs or camera frames) alongside an existing 3D point cloud, then uses both sources together to produce an improved or newly generated point cloud. This feedback loop, using a partial or prior 3D representation alongside flat image data, lets the model cross-check spatial guesses against actual pixel evidence.

  • Input: one or more 2D images of an object
  • Secondary input: one or more existing 3D point clouds (could be sparse or incomplete)
  • Output: a refined or newly generated 3D point cloud
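
The data flow in the list above can be sketched in a few lines of Python. This is not the neural network from the patent, whose architecture is not described here. The stand-in `silhouette_filter` is a deliberately simple classical check that keeps only the prior points that land on object pixels, and every name in the block (`generate_point_cloud`, `silhouette_filter`, `Image`) is a hypothetical chosen for illustration. It also assumes all images share one front-on view in which a point's x and y map directly to pixel column and row.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
# Toy image: rows of pixels, 1 = object, 0 = background (a silhouette mask).
Image = List[List[int]]
Model = Callable[[Sequence[Image], List[Point]], List[Point]]


def silhouette_filter(images: Sequence[Image], prior: List[Point]) -> List[Point]:
    """Stand-in for the network (assumption): keep each prior point whose
    projection (drop z, round x and y to a pixel) hits an object pixel
    in every image."""
    kept = []
    for x, y, z in prior:
        col, row = int(round(x)), int(round(y))
        if all(
            0 <= row < len(img) and 0 <= col < len(img[0]) and img[row][col] == 1
            for img in images
        ):
            kept.append((x, y, z))
    return kept


def generate_point_cloud(
    images: Sequence[Image],
    prior: List[Point],
    model: Model = silhouette_filter,
) -> List[Point]:
    """Images plus an existing point cloud go in; a refined cloud comes out."""
    return model(images, prior)


# The object occupies only the middle column of this 3x3 image.
mask: Image = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
prior = [(1.0, 0.0, 2.0), (0.0, 0.0, 2.0), (1.0, 2.0, 5.0), (2.0, 1.0, 1.0)]

refined = generate_point_cloud([mask], prior)
print(refined)  # [(1.0, 0.0, 2.0), (1.0, 2.0, 5.0)]
```

The point of the sketch is the signature, not the filter: any model with the same two inputs and one output, including a trained network, could be passed in as `model`.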

The claim is written at the processor level, meaning Nvidia is seeking protection for the hardware-plus-software combination, not just a software algorithm. This positions the invention close to Nvidia's GPU and embedded processor product lines.

What this means for robotics, self-driving, and 3D AI

The ability to reliably lift 3D geometry out of 2D camera footage is one of the central problems in autonomous vehicles, warehouse robotics, and augmented reality. Lidar sensors measure depth directly and are accurate but expensive; cameras are cheaper, but a flat image does not record how far away each pixel is, so traditional methods struggle with depth. A neural network that closes that gap by combining image data with existing 3D references could make high-quality spatial perception available in cost-sensitive hardware.
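
A short Python sketch shows why a single camera image is ambiguous about depth. It uses the standard pinhole camera model, in which a 3D point (x, y, z) lands on the image at (f·x/z, f·y/z); the function name `project` and the default focal length of 1.0 are assumptions for illustration, not details from the patent.

```python
from typing import Tuple


def project(point: Tuple[float, float, float], focal_length: float = 1.0) -> Tuple[float, float]:
    """Pinhole projection of a 3D point onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different points on the same line of sight from the camera.
near = (0.5, 0.25, 1.0)  # 1 unit away
far = (2.0, 1.0, 4.0)    # 4 units away, 4x farther off-axis

print(project(near))  # (0.5, 0.25)
print(project(far))   # (0.5, 0.25)
```

Both points hit the same pixel position, so the image alone cannot say which one the camera saw. Extra evidence, such as more views or an existing 3D reference, is needed to settle the depth.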

For you as a consumer, this kind of technology underpins features like obstacle detection in self-driving systems, 3D object recognition in robot arms, and even the spatial mapping your phone does in AR apps. Nvidia supplies chips for many of these applications, so a patent here would reinforce its position as a processor supplier for AI-driven spatial computing.

Editorial take

This is a foundational computer-vision patent rather than a flashy product announcement, but foundational is where Nvidia consistently wins. Protecting the processor-level implementation of 2D-to-3D reconstruction puts Nvidia in a strong position to tie this capability to its GPU and Jetson embedded platforms, exactly where autonomous systems are being built right now.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.