Nvidia · Filed Jan 21, 2026 · Published Jun 4, 2026 · verified — real USPTO data

Nvidia Patents a Way to Spot 3D Objects Far Beyond Lidar's Reach

Lidar — the laser-based ranging tech that gives self-driving cars their 3D vision — has a hard distance limit. Nvidia thinks it can blow past that ceiling using nothing but a regular camera and a clever neural network.

Figure: FIG. 1A from US 2026/0154975 A1, rendered from the official USPTO publication PDF.
Publication number: US 2026/0154975 A1
Applicant: NVIDIA Corporation
Filing date: Jan 21, 2026
Publication date: Jun 4, 2026
Inventors: Zetong Yang, Zhiding Yu, Ren Hao Wang, Chris Choy, Anima Anandkumar, Jose M. Alvarez Lopez
CPC classification: 382/156
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Feb 26, 2026)
Parent application: continuation of 18223473 (filed 2023-07-18)
Document: 21 claims

How Nvidia sees farther than lidar using just a camera

Imagine a self-driving car's sensors are like your eyesight in fog: past a certain distance, everything goes fuzzy. Today's autonomous vehicles use lidar, a laser-ranging sensor, to build a precise 3D map of nearby objects. But lidar only measures reliably out to a limited range, which leaves a dangerous blind spot on the highway when a stopped truck or merging car is still hundreds of meters away.

Nvidia's patent describes a system that takes over where lidar leaves off. Instead of needing laser data, it looks at a plain camera image, draws a 2D rectangle around an object (like a car or a pedestrian), and then uses a trained neural network to estimate how far away that object is — producing a full 3D position in space.

The trick is that the network has learned a kind of reverse geometry rule: given how big something looks in a 2D frame, it can infer how far away it is. That means your car could, in principle, "see" and place objects in 3D at distances beyond its lidar's useful range, which is exactly what long-highway autonomous driving needs.
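
The geometry being "reversed" can be illustrated with the textbook pinhole camera model, in which an object of real height H at depth Z appears h = f * H / Z pixels tall. The sketch below simply inverts that relation. It is not taken from the patent: the function name, the 2000-pixel focal length, and the 1.5 m car height are all assumptions chosen for illustration.

```python
def depth_from_box_height(focal_px: float, real_height_m: float, box_height_px: float) -> float:
    """Invert the pinhole relation h = f * H / Z to get Z = f * H / h."""
    if box_height_px <= 0:
        raise ValueError("box height must be positive")
    return focal_px * real_height_m / box_height_px


# Hypothetical camera and object: 2000 px focal length, 1.5 m tall car.
for box_height_px in (30.0, 15.0, 10.0):
    depth_m = depth_from_box_height(2000.0, 1.5, box_height_px)
    print(f"{box_height_px:>5.1f} px tall -> about {depth_m:.0f} m away")
```

Note how a car that is only 10 pixels tall already sits around 300 m out under these assumptions, so small errors in the box translate into large errors in depth. The closed-form rule also needs the object's true height, which is one reason a learned estimator is attractive.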

How the neural net guesses depth from a flat 2D box

The core pipeline has two steps. First, a detector draws a standard 2D bounding box — a flat rectangle — around each object spotted in a camera image. That part is well-understood computer vision.

The novel piece is step two. The system feeds that 2D box into a neural network trained to run what the patent calls a learned inverse function (essentially, a depth estimator baked into the network's weights). In classical geometry, if you know an object's real size and its apparent size in a camera frame, you can calculate depth. This network generalizes that idea: it has learned, from large annotated datasets, how to reverse-engineer depth from visual cues like box height, aspect ratio, and position in the frame — even when the training data only had lidar coverage for close-range objects.
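
To make the idea of learning an inverse mapping from close-range labels concrete, here is a deliberately tiny stand-in: a two-parameter log-linear regression instead of a neural network. It is fitted only on synthetic samples between 20 m and 100 m and then queried on a much smaller box. Everything here (the function names, the synthetic data, the assumed 2000 px focal length and 1.5 m object height) is hypothetical and is not the patent's model.

```python
import math


def fit_log_linear(box_heights_px, depths_m):
    """Least-squares fit of log(depth) = a + b * log(box height)."""
    xs = [math.log(h) for h in box_heights_px]
    ys = [math.log(z) for z in depths_m]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


def predict_depth(a, b, box_height_px):
    """Apply the fitted mapping to a new box height."""
    return math.exp(a + b * math.log(box_height_px))


# Synthetic close-range "lidar-labelled" samples from an assumed pinhole camera.
FOCAL_PX, HEIGHT_M = 2000.0, 1.5
train_depths = [20.0, 40.0, 60.0, 80.0, 100.0]
train_heights = [FOCAL_PX * HEIGHT_M / z for z in train_depths]

a, b = fit_log_linear(train_heights, train_depths)

# Query a box far smaller than anything seen during fitting.
far_depth = predict_depth(a, b, 10.0)
print(f"fitted slope: {b:.3f}, predicted depth for a 10 px box: {far_depth:.1f} m")
```

Because the synthetic data follows the pinhole rule exactly, the fit recovers a slope close to -1 and extrapolates cleanly past the training range. Real detections are noisy and objects vary in size, so extrapolation of this kind is not guaranteed in practice; that gap is what a richer learned model would have to close.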

The output is a 3D bounding box, a volume in space with a position, size, and orientation, defined by combining the 2D box geometry with the predicted depth. Key properties:

  • No lidar required at inference time — camera-only input
  • Works at distances beyond typical lidar range (~150–200 m)
  • The depth prediction is learned, not hand-engineered, so it adapts to different object categories
  • 3D box orientation and size are inferred jointly with depth

The approach sidesteps the fundamental lidar range ceiling by treating depth as something a vision model can learn rather than something a sensor must directly measure.
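
Combining a 2D box with a predicted depth can be sketched as a standard back-projection through the camera intrinsics. This is a minimal illustration under assumed names and values (`Box3D`, `lift_box_to_3d`, the intrinsics `fx`, `fy`, `cx`, `cy`), not the patent's actual procedure. It recovers only the box center and its extent in the image plane; orientation and length along the viewing direction would need separate predictions.

```python
from dataclasses import dataclass


@dataclass
class Box3D:
    x_m: float       # lateral offset in the camera frame (right is positive)
    y_m: float       # vertical offset in the camera frame (down is positive)
    z_m: float       # depth along the optical axis
    width_m: float   # metric extent across the image
    height_m: float  # metric extent up and down the image


def lift_box_to_3d(box_px, depth_m, fx, fy, cx, cy):
    """Back-project a 2D box (u_min, v_min, u_max, v_max) placed at depth_m."""
    u_min, v_min, u_max, v_max = box_px
    u_c = (u_min + u_max) / 2.0
    v_c = (v_min + v_max) / 2.0
    return Box3D(
        x_m=(u_c - cx) * depth_m / fx,
        y_m=(v_c - cy) * depth_m / fy,
        z_m=depth_m,
        width_m=(u_max - u_min) * depth_m / fx,
        height_m=(v_max - v_min) * depth_m / fy,
    )


# Hypothetical 1920x1080 camera with a 2000 px focal length.
box = lift_box_to_3d(
    (1000.0, 520.0, 1020.0, 535.0), depth_m=300.0,
    fx=2000.0, fy=2000.0, cx=960.0, cy=540.0,
)
print(box)
```

The same scaling that places the box also amplifies depth error: every term is multiplied by `depth_m`, so a 10 percent depth mistake moves and resizes the box by 10 percent.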

What this means for self-driving car safety at long range

For autonomous vehicles, long-range 3D awareness isn't a nice-to-have — it's a safety requirement. At highway speeds, a car needs to react to obstacles hundreds of meters away, well outside current lidar's comfortable operating envelope. Nvidia's approach could let a vehicle running purely on cameras (or a camera-lidar hybrid) maintain full 3D situational awareness at those distances without requiring increasingly expensive long-range lidar hardware.
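
A rough back-of-the-envelope calculation shows why "hundreds of meters" is a plausible requirement. The sketch below adds the distance covered during a reaction delay to the distance needed to brake at constant deceleration. The 1.5 s delay and the gentle 3 m/s² deceleration are illustrative assumptions, not figures from the patent.

```python
def stopping_distance_m(speed_kmh: float, reaction_s: float, decel_ms2: float) -> float:
    """Reaction distance plus constant-deceleration braking distance."""
    speed_ms = speed_kmh / 3.6
    reaction_dist = speed_ms * reaction_s
    braking_dist = speed_ms ** 2 / (2.0 * decel_ms2)
    return reaction_dist + braking_dist


# Assumed values: 1.5 s system-plus-actuation delay, 3 m/s^2 comfortable braking.
for speed_kmh in (100.0, 120.0, 130.0):
    d = stopping_distance_m(speed_kmh, reaction_s=1.5, decel_ms2=3.0)
    print(f"{speed_kmh:.0f} km/h -> roughly {d:.0f} m to stop")
```

Under these assumptions, 120 km/h works out to roughly 235 m, which is already past the 150–200 m range cited above. Harder braking shortens the figure considerably, so treat it as an order-of-magnitude guide only.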

Beyond self-driving, the same technique applies anywhere cameras must reason about depth in real time: drones, robotic warehouses, and traffic management systems. If Nvidia can make this work reliably, it also has direct implications for cost reduction — cheaper sensor stacks that still pass safety certification thresholds.

Editorial take

This is a genuinely useful piece of research from Nvidia's autonomous-vehicle team, and the problem it targets — the hard range limit of lidar — is one the industry has been wrestling with for years. The camera-only depth inference approach isn't new as a concept, but packaging it as a trainable inverse function tied directly to 2D bounding boxes is a clean architectural choice. Whether it holds up in edge-case weather and lighting conditions is the real test, but the patent's framing around safety-critical autonomous driving signals Nvidia is aiming this squarely at production, not just research.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.