Google · Filed Jan 16, 2026 · Published May 21, 2026

Google Seeks a Patent on a Flow-Model Approach to Training NeRFs on Sparse Photos

Neural Radiance Fields are extraordinary at reconstructing 3D scenes, but they usually demand dozens or hundreds of photos to work well. Google's new patent application targets the hard case: what happens when you only have a handful of images?

Figure: FIG. 1A of US 2026/0141626 A1 (NeRF training with sparse image data), rendered from the official USPTO publication PDF.
Publication number US 2026/0141626 A1
Applicant Google LLC
Filing date Jan 16, 2026
Publication date May 21, 2026
Inventors Noha Radwan, Jonathan Tilton Barron, Benjamin Joseph Mildenhall, Seyed Mohammad Mehdi Sajjadi, Michael Niemeyer
US classification (class/subclass) 382/156
Grant likelihood Medium
Examiner Not yet assigned (listed as Central Docket, Art Unit OPAP)
Status Docketed New Case - Ready for Examination (Feb 17, 2026)
Parent application Continuation of 18/012,270 (filed Dec 22, 2022)
Document 20 claims

What Google's sparse-data NeRF training actually does

Imagine you want to build a photorealistic 3D model of a room, but you only have two or three photos of it taken from different angles. Most 3D reconstruction tools would either fail outright or produce a blurry, artifact-ridden mess because they don't have enough information to fill in the gaps.

Google's approach here trains a Neural Radiance Field (NeRF) — a type of AI model that learns to synthesize new camera angles of a scene — by giving it a second AI system as a quality checker. That second system, called a normalizing flow model, has been pre-trained on thousands of real images and has developed a strong sense of what natural colors, edges, and textures look like. When the NeRF produces a rendered view, the flow model scores how realistic it looks and feeds that score back as a training signal.

The NeRF also gets graded against small cropped regions — patches — of the original ground truth photos, rather than just individual pixels. That patch-level comparison forces the model to get local detail right, not just global brightness, making the final 3D output noticeably cleaner even when source images are scarce.
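To make the patch comparison concrete, here is a minimal sketch of scoring a rendered patch against the matching crop of a ground truth photo. The helper names `crop_patch` and `patch_mse`, the 8x8 patch size, and the use of mean squared error are all assumptions for illustration; the filing's actual loss is not reproduced here.

```python
import numpy as np

def crop_patch(image, top, left, size):
    """Return a size x size crop from an H x W x 3 image (hypothetical helper)."""
    return image[top:top + size, left:left + size, :]

def patch_mse(rendered_patch, truth_patch):
    """Mean squared error over every pixel and channel in the patch."""
    diff = rendered_patch.astype(np.float64) - truth_patch.astype(np.float64)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
photo = rng.random((32, 32, 3))      # stand-in for a ground truth photo
render = photo.copy()                # stand-in for the NeRF's rendering
render[8:16, 8:16, :] += 0.5         # simulate a local artifact in one region

# Same-location crops from the render and the photo are compared patch by patch.
clean = patch_mse(crop_patch(render, 16, 16, 8), crop_patch(photo, 16, 16, 8))
damaged = patch_mse(crop_patch(render, 8, 8, 8), crop_patch(photo, 8, 8, 8))
print(clean, damaged)
```

The patch covering the artifact scores far worse than the untouched one, which is the kind of localized signal a patch-level comparison provides.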

How the flow model penalizes NeRF rendering artifacts

The patent describes a two-stage training pipeline designed to make NeRF models robust when input imagery is sparse.

Stage one: pre-training the flow model. A normalizing flow model (a generative model that learns the statistical distribution of real images — essentially a learned sense of what a plausible image looks like) is trained on a large, diverse dataset of images covering many scenes. The goal is for this model to internalize natural geometry, color transitions, and texture statistics, so it can later act as a realism referee.
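As a rough illustration of what "learning the distribution of real images" means, the sketch below fits the simplest possible normalizing flow: a per-dimension affine map onto a standard normal base distribution. This is a toy stand-in, not the architecture in the filing, and the names `fit_affine_flow` and `log_likelihood` as well as the synthetic "training patches" are assumptions.

```python
import numpy as np

def fit_affine_flow(patches):
    """Fit z = (x - mu) / sigma per dimension; mean and std are the maximum-likelihood fit."""
    flat = patches.reshape(len(patches), -1)
    mu = flat.mean(axis=0)
    sigma = flat.std(axis=0) + 1e-6   # small offset avoids division by zero
    return mu, sigma

def log_likelihood(patch, mu, sigma):
    """Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z = (patch.reshape(-1) - mu) / sigma
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2.0 * np.pi)
    log_det = -np.sum(np.log(sigma))
    return float(log_base + log_det)

rng = np.random.default_rng(0)
# Synthetic "natural" patches: smooth mid-gray values with mild variation (an assumption).
training_patches = 0.5 + 0.05 * rng.standard_normal((1000, 8, 8, 3))
mu, sigma = fit_affine_flow(training_patches)

typical = 0.5 + 0.05 * rng.standard_normal((8, 8, 3))   # resembles the training data
blotchy = rng.random((8, 8, 3))                          # uniform noise, unlike the training data
ll_typical = log_likelihood(typical, mu, sigma)
ll_blotchy = log_likelihood(blotchy, mu, sigma)
print(ll_typical > ll_blotchy)
```

A real flow model stacks many learned invertible layers, but the scoring principle is the same: inputs that resemble the training data receive a higher log-likelihood.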

Stage two: NeRF training with dual supervision. When training each NeRF on a new scene, the system does two things simultaneously:

  • It crops ground truth patches from the available real photos and compares them directly to corresponding patches rendered by the NeRF — a patch-level loss that enforces local structural fidelity.
  • It passes each NeRF-rendered view through the pre-trained flow model, which outputs a likelihood score (how probable is this image under the learned distribution of real images?). Low-likelihood outputs — meaning the NeRF produced something unnaturally blotchy or geometrically inconsistent — result in a higher training penalty.

By combining these two signals, the NeRF is pushed toward outputs that are both faithful to the sparse input photos and statistically consistent with how real scenes look. The patch database referenced in the filing appears to be the pre-collected image corpus used to train the flow model prior to any NeRF training.
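One plausible way to combine the two signals is a weighted sum of a patch reconstruction error and the negative log-likelihood from the prior. The sketch below assumes that form; `total_loss`, `prior_weight`, and the per-pixel Gaussian used in place of a trained flow model are hypothetical and not taken from the filing.

```python
import numpy as np

def negative_log_likelihood(patch, mu, sigma):
    """NLL under a per-pixel Gaussian prior (toy stand-in for a trained flow model)."""
    z = (patch.reshape(-1) - mu) / sigma
    log_p = (-0.5 * np.sum(z ** 2)
             - 0.5 * z.size * np.log(2.0 * np.pi)
             - np.sum(np.log(sigma)))
    return float(-log_p)

def total_loss(rendered, truth, mu, sigma, prior_weight=1e-3):
    """Patch reconstruction error plus a weighted realism penalty (assumed form)."""
    reconstruction = float(np.mean((rendered - truth) ** 2))
    realism_penalty = negative_log_likelihood(rendered, mu, sigma)
    return reconstruction + prior_weight * realism_penalty

truth = np.full((8, 8, 3), 0.5)       # stand-in ground truth patch
mu = np.full(truth.size, 0.5)         # prior parameters, assumed already fitted
sigma = np.full(truth.size, 0.1)

faithful = truth.copy()               # render that matches the photo
blotchy = truth.copy()
blotchy[::2] += 0.3                   # every other row is too bright

loss_faithful = total_loss(faithful, truth, mu, sigma)
loss_blotchy = total_loss(blotchy, truth, mu, sigma)
print(loss_faithful < loss_blotchy)
```

The blotchy render is penalized twice, once for departing from the photo and once for being improbable under the prior. In a sparse-view setting the second term matters most where photo coverage is thin.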

What this means for real-world 3D scene reconstruction

NeRF's practical Achilles' heel has always been data hunger. In controlled lab settings — a camera rig with 100 angles, a static object, perfect lighting — NeRFs produce stunning results. But in real-world deployments like robotics, AR scene capture, or satellite imagery analysis, you rarely get that luxury. A technique that produces clean outputs from just a few photos dramatically expands where NeRF-based reconstruction is actually usable.

For Google specifically, this could plausibly feed applications like Immersive View in Google Maps, ARCore scene understanding, or any product that needs to reconstruct environments from street-level or user-captured imagery. The use of a pre-trained flow model as a plug-in realism prior also suggests a modular design: the same flow model could potentially supervise many different NeRF variants without retraining from scratch.

Editorial take

This is solid, focused research from a team that includes Jon Barron and Ben Mildenhall — two of the original NeRF authors — so the pedigree here is about as strong as it gets in this space. The core idea of using a normalizing flow model as a learned image prior for NeRF supervision is genuinely well-motivated and addresses a real bottleneck. It's not a flashy consumer feature patent; it's the kind of deep infrastructure work that quietly ends up powering Google's 3D mapping and AR products years later.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.