New Google Patents · Filed Apr 18, 2025 · Published Jun 4, 2026 · verified — real USPTO data

Google Patents an AI That Reconstructs 3D Objects From One Photo — and Admits When It's Guessing

Figuring out what a 3D object looks like from a single flat photo is genuinely hard — and most AI systems that try it won't tell you when they're guessing. Google's new patent tackles both problems at once.

Google Patent: 3D Shape Inference from 2D Images with NeRF — figure from US 2026/0154897 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number US 2026/0154897 A1
Applicant Google LLC
Filing date Apr 18, 2025
Publication date Jun 4, 2026
Inventors Benjamin Sang Lee, Matthew Douglas Hoffman, Tuan Anh Le, Pavel Sountsov, Ryan Michael Rifkin, Christopher Gordon Suter
CPC classification 345/426
Grant likelihood Medium
Examiner Not yet assigned (listed as CENTRAL, DOCKET; Art Unit OPAP)
Status Docketed New Case - Ready for Examination (Feb 26, 2026)
Parent application National Stage Entry of PCT/US2023/035603 (filed Oct 20, 2023)
Document 21 claims

What Google's uncertainty-aware 3D reconstruction actually does

Imagine taking one photo of a coffee mug and asking an AI to tell you exactly what the back of it looks like. That's a tough ask — there are many possible shapes that could produce the same photo, and a confident wrong answer is worse than an honest "I'm not sure."

Google's patent describes a system that runs this reconstruction process many times, each time making slightly different assumptions about what the hidden parts of the object might look like. The result isn't one single 3D model — it's a collection of plausible models, which together show you the range of possibilities the AI considered.

That built-in uncertainty is the key idea. Instead of handing you one confident-but-potentially-wrong answer, the system shows you where it's sure (the front of the mug, which it can see) and where it's guessing (the back, which it can't). That kind of honest output is far more useful for any downstream task that needs to act on the reconstruction.

How the hypernetwork and NeRF sampling loop work together

The system is built around Neural Radiance Fields (NeRF), a technique that represents a 3D scene as a learned mathematical function rather than a mesh or point cloud: feed in a 3D position and a viewing direction, and it returns the color and density at that point. Once that function is trained, the scene can be rendered from any camera angle. The challenge is inverting that process: starting from a single photo and recovering the underlying 3D structure.
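To make "a scene as a function" concrete, here is a minimal Python sketch. It is an illustration only, not code from the patent: toy_field is a hand-written stand-in for a trained NeRF (a real one is a learned neural network), and render_ray uses the standard NeRF-style accumulation of color along a camera ray. All names and constants are assumptions made for this example.

```python
import math

def toy_field(point, view_dir):
    """Hypothetical stand-in for a trained NeRF: 3D point + view direction -> (rgb, density).
    Here the 'scene' is a solid red sphere of radius 1 at the origin."""
    x, y, z = point
    inside = x * x + y * y + z * z <= 1.0
    density = 5.0 if inside else 0.0
    return (1.0, 0.0, 0.0), density  # view_dir is ignored by this toy field

def render_ray(field, origin, direction, near=0.0, far=4.0, n_samples=64):
    """Accumulate color along one camera ray by sampling the field at regular steps."""
    delta = (far - near) / n_samples
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that has not been blocked so far
    for i in range(n_samples):
        t = near + (i + 0.5) * delta
        point = tuple(o + t * d for o, d in zip(origin, direction))
        rgb, sigma = field(point, direction)
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this small segment
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return tuple(color), 1.0 - transmittance  # (color, total opacity)

# One ray through the sphere, one ray that passes well above it.
hit_color, hit_opacity = render_ray(toy_field, (0.0, 0.0, -2.0), (0.0, 0.0, 1.0))
miss_color, miss_opacity = render_ray(toy_field, (0.0, 3.0, -2.0), (0.0, 0.0, 1.0))
```

The ray through the sphere comes back almost fully opaque and red, while the ray that misses it comes back empty. Rendering a full image is the same loop repeated once per pixel.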

Google's approach adds a Bayesian uncertainty layer. (Bayesian inference means starting from prior beliefs and updating them as evidence comes in.) Rather than finding one best-fit NeRF, the system samples from a posterior distribution: a probability distribution over all the NeRF configurations that are consistent with the input image and with what the model has learned about how objects generally look.
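In symbols, this is Bayes' rule. The notation below is ours, chosen for illustration, and is not taken from the patent: z stands for one candidate scene description and x for the input photo.

```latex
p(z \mid x) \;=\; \frac{p(x \mid z)\, p(z)}{p(x)} \;\propto\; p(x \mid z)\, p(z)
```

Here p(z) is the prior (how plausible a shape is before seeing the photo), p(x | z) is the likelihood (how well that shape explains the photo), and p(z | x) is the posterior the system samples from.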

The sampling mechanism works through a hypernetwork — a secondary neural network whose job is to generate the weights for the primary NeRF model. Each iteration, a compact object code (a compressed description of one plausible version of the scene) is drawn from the distribution and fed through the hypernetwork, which outputs a full set of NeRF parameters. That NeRF then renders one sample image.
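The code-to-weights step can be sketched in a few lines of Python. This is a toy under stated assumptions, not the patent's architecture: the hypernetwork here is a single random linear layer (a real one would be trained and much larger), the primary network is a four-parameter density function rather than a full NeRF, and object codes are drawn from a plain Gaussian as a stand-in for the learned distribution. Every name and size is hypothetical.

```python
import math
import random

CODE_DIM = 4   # size of the compact object code (assumed)
N_PARAMS = 4   # parameters of the toy primary network: 3 weights + 1 bias

rng = random.Random(0)

# Hypernetwork parameters. Random here; learned during training in a real system.
HYPER_W = [[rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)] for _ in range(N_PARAMS)]
HYPER_B = [rng.gauss(0.0, 1.0) for _ in range(N_PARAMS)]

def hypernetwork(code):
    """Map an object code to the full parameter vector of the primary network."""
    return [sum(w * c for w, c in zip(row, code)) + b
            for row, b in zip(HYPER_W, HYPER_B)]

def make_field(params):
    """Build the toy primary network (3D point -> density) from generated parameters."""
    weights, bias = params[:3], params[3]
    def density(point):
        pre = sum(w * p for w, p in zip(weights, point)) + bias
        return math.log1p(math.exp(pre))  # softplus keeps density positive
    return density

def sample_code():
    """Draw one object code. Stand-in for sampling from the learned distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)]

code = sample_code()
field = make_field(hypernetwork(code))
value = field((0.1, 0.2, 0.3))  # density of this sampled scene at one 3D point
```

The point of the design is that randomness lives in a small code rather than in millions of network weights: each new code yields a complete, different scene function.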

  • Run many iterations → get many sample images
  • High agreement across samples = high confidence
  • High disagreement across samples = genuine geometric ambiguity
  • The spread of outputs is the uncertainty estimate
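The loop in the bullets above can be sketched as follows. This is a simplified illustration, not the patent's method: render_sample is a hypothetical stand-in for the whole "code, hypernetwork, NeRF, render" chain, and the four-pixel "image" is rigged so that two pixels are fixed by the photo and two depend on the sampled code.

```python
import random
import statistics

rng = random.Random(0)

def sample_code():
    """Draw one object code (stand-in for sampling from the posterior)."""
    return [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]

def render_sample(code):
    """Hypothetical stand-in for 'hypernetwork -> NeRF -> rendered image'.
    Pixels 0-1: visible in the photo, so every sample agrees.
    Pixels 2-3: hidden from the camera, so they vary with the sampled code."""
    return [0.8, 0.3, 0.5 + 0.2 * code[0], 0.5 + 0.2 * code[1]]

# Run many iterations -> get many sample images.
samples = [render_sample(sample_code()) for _ in range(200)]

# Group values by pixel, then summarize each pixel across all samples.
per_pixel = list(zip(*samples))
mean_image = [statistics.fmean(values) for values in per_pixel]
uncertainty = [statistics.pstdev(values) for values in per_pixel]  # spread per pixel
```

In this toy, the first two entries of uncertainty are zero (every sample agrees) and the last two are clearly nonzero (the samples disagree), which is the per-pixel version of "sure about the front, guessing about the back."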

What this means for AR, robotics, and 3D content tools

For fields like augmented reality, robotics, and 3D content creation, reconstructing an object's full shape from one or a few photos is a core bottleneck. Most current tools either require many carefully positioned photos or produce a single output with no signal about where the reconstruction might be wrong. A system that quantifies its own uncertainty lets downstream applications (a robot deciding whether to grasp an object, an AR app anchoring a virtual overlay) make smarter decisions when the geometry is ambiguous.
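As a rough illustration of what "smarter decisions" could look like, here is a hypothetical Python sketch of a robot using a per-pixel uncertainty map. The function name, the threshold, and the numbers are all invented for this example; nothing here comes from the patent.

```python
def choose_action(uncertainty_map, grasp_region, threshold=0.05):
    """Grasp only if every pixel in the target region is below the uncertainty threshold;
    otherwise ask for another camera view first."""
    worst = max(uncertainty_map[i] for i in grasp_region)
    return "grasp" if worst <= threshold else "capture_another_view"

# Made-up uncertainty values: pixels 0-1 are the visible front, 2-3 the hidden back.
uncertainty_map = [0.00, 0.01, 0.21, 0.18]

front = choose_action(uncertainty_map, grasp_region=[0, 1])
back = choose_action(uncertainty_map, grasp_region=[2, 3])
```

With these numbers the robot would grasp at the front of the object but request another view before reaching around the back. A system that outputs a single model with no uncertainty has no way to tell those two cases apart.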

For Google specifically, this sits naturally alongside products like Google Maps 3D, ARCore, and Search's 3D object previews. A reconstruction pipeline that knows what it doesn't know is also far more useful for training data generation, where quietly wrong labels are a persistent problem.

Editorial take

This is serious computer vision research dressed up as a patent, and the core idea — using a hypernetwork to turn probabilistic object codes into NeRF weights — is genuinely clever. The uncertainty quantification angle elevates it above routine NeRF-from-single-image filings, because it's the kind of property that actually matters when you're deploying reconstruction in the real world. Worth watching.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.