Nvidia · Filed Feb 19, 2026 · Published Jun 25, 2026 · verified — real USPTO data

Nvidia Patents a System That Builds 3D Maps of Objects From Flat Sensor Readings

By Patentlyze Team · Updated Jun 26, 2026

Most cameras only see the world in flat, 2D images. Nvidia's new patent application describes a way for a machine to work out where an object sits in full 3D space, without necessarily relying on expensive extra sensors to do it.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number US 2026/0179318 A1

Applicant NVIDIA Corporation

Filing date Feb 19, 2026

Publication date Jun 25, 2026

Inventors Abhishek Bajpayee, Sai Krishnan Chandrasekar, Xudong Chen, Hae Jong Seo, Siddharth Kothiyal

CPC classification 345/419

Grant likelihood Medium

Examiner CENTRAL, DOCKET (Art Unit OPAP)

Status Docketed New Case - Ready for Examination (Mar 22, 2026)

Parent application Continuation of 18351917 (filed Jul 13, 2023)

Document 20 claims

What Nvidia's 3D object detection system actually does

Imagine a self-driving car that spots a pedestrian ahead. The camera feeds back a flat picture, like a photograph, but what the car's computer really needs to know is: how far away is that person, how tall are they, and exactly where are they in three dimensions? Getting that wrong, even slightly, could mean the difference between stopping safely and not.

Nvidia's patent covers a system that takes the kind of flat, 2D visual information a camera naturally produces, combines it with depth data and labels that identify what the object is (a person, a cone, a car), and assembles all of that into a full 3D representation of the detected object. The machine then acts on that 3D picture rather than the raw flat image.

The system runs on specialized chips (Nvidia's own SoCs, which bundle a CPU, GPU, and dedicated accelerators together) and connects directly to whatever sensors the machine is using. It's designed for any machine that needs to understand its physical surroundings in real time.

How 2D data becomes a full 3D landmark representation

The patent describes a machine equipped with one or more systems-on-a-chip (SoCs), single chips that combine a general-purpose processor, a graphics processor, and specialized hardware accelerators all in one package. Those chips process data from onboard sensors (cameras, lidar, radar, or similar) that are pointed outward at the machine's environment.

When the system detects a landmark or object, it pulls together three distinct pieces of information:

2D location information: where the object appears in the flat camera image, essentially its pixel coordinates
Semantic classifier information: a label that identifies what type of object it is (pedestrian, vehicle, traffic sign, etc.), produced by a neural network trained to categorize what it sees
Depth information: how far the object actually is from the sensor, which can come from a depth camera or lidar, or be estimated computationally

Those three inputs are fused together to produce a 3D representation of the detected object, including its predicted size, orientation, and position in real-world space. The machine then uses that 3D model to decide what to do next, whether that's navigating around an obstacle or tracking a moving target.
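
To make the fusion step concrete, here is a minimal sketch of one common way to lift a pixel location plus a depth value into a 3D point: the standard pinhole camera model. The article does not say which method the patent uses, so treat this as an illustration rather than Nvidia's approach. The class names, the function name, and the camera intrinsics (focal lengths and image center) are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Detection2D:
    """Hypothetical container for the three inputs described above."""
    u: float        # pixel column of the object's center
    v: float        # pixel row of the object's center
    label: str      # semantic class, e.g. "pedestrian"
    depth_m: float  # distance along the camera's optical axis, in meters


@dataclass
class Landmark3D:
    """Hypothetical 3D result: a labeled point in the camera's frame."""
    x: float  # meters to the right of the camera
    y: float  # meters below the camera
    z: float  # meters in front of the camera
    label: str


def back_project(det, fx, fy, cx, cy):
    """Lift a pixel plus depth to a 3D point using the pinhole camera model."""
    x = (det.u - cx) * det.depth_m / fx
    y = (det.v - cy) * det.depth_m / fy
    return Landmark3D(x=x, y=y, z=det.depth_m, label=det.label)


# Assumed intrinsics for a 1280x720 camera (illustrative values only).
FX, FY, CX, CY = 1000.0, 1000.0, 640.0, 360.0

det = Detection2D(u=840.0, v=360.0, label="pedestrian", depth_m=10.0)
lm = back_project(det, FX, FY, CX, CY)
print(lm)  # Landmark3D(x=2.0, y=0.0, z=10.0, label='pedestrian')
```

In this example a pedestrian detected 200 pixels right of the image center, at a depth of 10 meters, lands 2 meters to the right of the camera and 10 meters ahead. A real system would also estimate size and orientation, which this sketch leaves out.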

The claim is broad: it covers any machine performing operations based on a 3D landmark representation derived this way, which could include autonomous vehicles, drones, or industrial robots.

What this means for self-driving and autonomous robots

For autonomous vehicles and robots, knowing the precise 3D shape and position of nearby objects is the core safety problem. A flat image tells you a car is in front of you; a 3D model tells you it's 12 meters away, angled 15 degrees, and partially blocking the lane. That difference is what separates a near-miss from a safe, well-timed response.
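
A rough worked example shows why orientation matters. The sketch below places a car-sized box 12 meters ahead and compares how far it reaches into the machine's own lane when it is parked straight versus angled 15 degrees. Only the 12 meters and 15 degrees come from the scenario above; the car dimensions (4.5 m by 1.8 m), the 3.5 m lane width, the 2.8 m sideways offset, and the function names are assumptions made for illustration.

```python
import math


def footprint_corners(cx, cz, length, width, yaw_deg):
    """Return the four ground-plane corners (lateral x, forward z) of a box.

    yaw_deg = 0 means the box's long axis points straight ahead (along z).
    """
    yaw = math.radians(yaw_deg)
    corners = []
    for dl, dw in ((1, 1), (1, -1), (-1, -1), (-1, 1)):
        along = dl * length / 2.0   # offset along the box's long axis
        across = dw * width / 2.0   # offset along the box's short axis
        x = cx + along * math.sin(yaw) + across * math.cos(yaw)
        z = cz + along * math.cos(yaw) - across * math.sin(yaw)
        corners.append((x, z))
    return corners


def lane_overlap(corners, lane_half_width):
    """Width in meters of the box's overlap with a straight lane centered on x = 0."""
    xs = [x for x, _ in corners]
    overlap = min(max(xs), lane_half_width) - max(min(xs), -lane_half_width)
    return max(0.0, overlap)


# Assumed scene: car centered 2.8 m to the right and 12 m ahead.
CAR_LENGTH, CAR_WIDTH = 4.5, 1.8
LANE_HALF_WIDTH = 3.5 / 2.0

overlap_0 = lane_overlap(
    footprint_corners(2.8, 12.0, CAR_LENGTH, CAR_WIDTH, 0.0), LANE_HALF_WIDTH)
overlap_15 = lane_overlap(
    footprint_corners(2.8, 12.0, CAR_LENGTH, CAR_WIDTH, 15.0), LANE_HALF_WIDTH)

print(f"straight: {overlap_0:.2f} m into the lane")   # straight: 0.00 m into the lane
print(f"angled:   {overlap_15:.2f} m into the lane")  # angled:   0.40 m into the lane
```

Under these assumptions, the same car at the same distance is clear of the lane when straight but intrudes about 0.40 meters when angled 15 degrees, a distinction a single flat image does not give directly.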

Nvidia already supplies the Drive computing platform used by many automakers for autonomous-driving development. A patent that covers the fundamental process of converting 2D sensor feeds into actionable 3D object data sits right at the center of that business. If granted broadly, it could give Nvidia a meaningful position in how perception software is built across the industry.

Editorial take

This is a foundational perception patent, not a flashy user-facing feature, but foundational is exactly where Nvidia wants to plant its flag in autonomous systems. The claim language is intentionally wide, covering any machine that does 3D object reasoning from 2D inputs plus depth plus semantic labels, which describes nearly every modern autonomous-vehicle perception stack. Whether it survives prior-art scrutiny at that breadth is the real question.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
