Apple · Filed Nov 14, 2025 · Published May 28, 2026 · verified — real USPTO data

Apple Patents a System That Builds 3D Scenes From a Single Photo

Apple is patenting a pipeline that takes a single, ordinary 2D photo and reconstructs a full 3D scene from it — no stereo camera, no depth sensor required at capture time.

Apple Patent: Mono Image to 3D Gaussian Splatting — figure from US 2026/0148481 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number US 2026/0148481 A1
Applicant Apple Inc.
Filing date Nov 14, 2025
Publication date May 28, 2026
Inventors Lars Mescheder, Stephan R. Richter, Vladlen Koltun, Afshin Taghavi Nasrabadi, Shiwei Li, Tian Fang, Wei Dong, Xuyang Bai, Yanghai Tsin, Jean-Daniel E. Nahmias, Marcel S. Santos, Peiyun Hu, Bruno Paul Jean-Claude Lecouat, Mingmin Zhen, Amael Delaunoy
US classification (USPC) 345/419
Grant likelihood Medium
Examiner Not yet assigned (listed as CENTRAL, DOCKET; Art Unit OPAP)
Status Docketed New Case - Ready for Examination (Dec 4, 2025)
Parent application Claims priority from provisional application 63/724,096 (filed Nov 22, 2024)
Document 20 claims

What Apple's mono-to-3D conversion actually does

Imagine snapping a photo on your iPhone and then being able to walk around the scene in it — seeing the table from the side, the lamp from behind. Today that typically requires specialized hardware or multiple shots from different angles. Apple's patent describes a way to do it from a single flat image.

The system first figures out how far away different parts of the scene are (depth estimation), then uses that depth info to build a 3D Gaussian representation — essentially a cloud of fuzzy, colored blobs that together reconstruct the scene's appearance in three dimensions. Once you have that, software can render the scene from any angle.

The output is rendered using a technique called Gaussian splatting, where each blob is "splatted" onto the screen from your current viewpoint. It's fast enough for real-time use and produces surprisingly photorealistic results, which is why it's a hot area in spatial computing right now.
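
To make "splatting" concrete, here is a minimal Python sketch of what projecting one blob onto the screen can look like. It assumes a simple pinhole camera and round (isotropic) blobs; production 3DGS renderers use stretched, rotated Gaussians and project a full covariance matrix. The function names and parameters (project_gaussian, falloff, focal, cx, cy) are hypothetical and are not taken from the patent.

```python
import math


def project_gaussian(center, radius, focal, cx, cy):
    """Project a round 3D Gaussian to a screen-space center and spread (pixels).

    Assumes a pinhole camera looking down +z with focal length `focal`
    (in pixels) and principal point (cx, cy). Illustrative only.
    """
    x, y, z = center
    if z <= 0:
        return None  # behind the camera, nothing to draw
    u = focal * x / z + cx
    v = focal * y / z + cy
    sigma_px = focal * radius / z  # farther blobs cover fewer pixels
    return (u, v, sigma_px)


def falloff(pixel, projected):
    """Weight of the splat at a pixel: 1.0 at the center, fading outward."""
    u, v, sigma_px = projected
    du = pixel[0] - u
    dv = pixel[1] - v
    return math.exp(-0.5 * (du * du + dv * dv) / (sigma_px * sigma_px))


splat = project_gaussian((0.0, 0.0, 2.0), radius=0.5, focal=100.0, cx=50.0, cy=50.0)
center_weight = falloff((50.0, 50.0), splat)  # pixel at the splat center
edge_weight = falloff((75.0, 50.0), splat)    # one sigma to the right
```

Moving the virtual camera changes each blob's camera-space center, so the same stored blobs produce a different 2D picture for every viewpoint.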

How Apple lifts depth and splats from one flat image

The patent describes a four-step pipeline running on a device with a processor:

  • Obtain a 2D image — any standard single-camera photo of a scene.
  • Estimate depth — the system determines how far each portion of the scene is from the camera, producing a depth map. This can be inferred purely from the image itself using learned models, no LiDAR required.
  • Generate 3D Gaussian data — using both the original image and the depth map, the system builds a set of 3D Gaussians (think: overlapping, semi-transparent blobs in 3D space, each with a position, orientation, size, color, and opacity).
  • Render multiple viewpoints via splatting — the Gaussians are projected ("splatted") onto a 2D screen from any desired camera angle, producing a view-consistent image of the scene from that new perspective.
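
The first three steps above can be sketched in a few lines of Python. This is a deliberately naive illustration: it assumes one Gaussian per pixel, a pinhole camera, identity orientation, and full opacity, whereas a real system would most likely predict these attributes with a learned model. Everything named here (Gaussian3D, lift_to_gaussians, focal, cx, cy) is hypothetical and not drawn from the patent text.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Gaussian3D:
    position: Vec3                               # center in camera space
    rotation: Tuple[float, float, float, float]  # orientation quaternion (w, x, y, z)
    scale: Vec3                                  # per-axis size
    color: Vec3                                  # RGB in [0, 1]
    opacity: float                               # 0 = invisible, 1 = solid


def lift_to_gaussians(image, depth, focal, cx, cy):
    """Back-project each pixel to a 3D Gaussian using its estimated depth.

    `image` is a list of rows of RGB tuples; `depth` is a matching list of
    rows of distances along the camera's z axis. Assumption: pinhole camera.
    """
    gaussians = []
    for v, (color_row, depth_row) in enumerate(zip(image, depth)):
        for u, (rgb, z) in enumerate(zip(color_row, depth_row)):
            x = (u - cx) * z / focal
            y = (v - cy) * z / focal
            size = z / focal  # roughly one pixel's footprint at depth z
            gaussians.append(
                Gaussian3D((x, y, z), (1.0, 0.0, 0.0, 0.0), (size, size, size), rgb, 1.0)
            )
    return gaussians


# Tiny 2x2 "photo" with a made-up depth map (near top row, far bottom row).
image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
         [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]
depth = [[2.0, 2.0],
         [4.0, 4.0]]
scene = lift_to_gaussians(image, depth, focal=2.0, cx=0.5, cy=0.5)
```

A lift this simple leaves holes wherever the original photo could not see, such as the back of the lamp. Filling those regions plausibly is a large part of what makes the single-image case hard.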

Gaussian splatting (technically 3D Gaussian Splatting, or 3DGS) is a rendering method that has exploded in popularity since 2023 as a fast, differentiable alternative to neural radiance fields (NeRF). Instead of querying a neural network at many sample points along every pixel's ray, you store the scene as explicit Gaussians and rasterize them, which is much faster at render time.
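
The rasterization step comes down to sorting and blending rather than running a network. The Python sketch below shows front-to-back alpha compositing for a single pixel, the standard blending scheme used in 3DGS renderers. It assumes each splat's alpha already combines the Gaussian's opacity with its falloff at that pixel. The function name composite_pixel and the tuple layout are hypothetical.

```python
def composite_pixel(splats):
    """Blend the splats covering one pixel, nearest first.

    Each splat is (depth, (r, g, b), alpha). Returns the final RGB tuple.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked by nearer splats
    for _, rgb, alpha in sorted(splats, key=lambda s: s[0]):
        weight = transmittance * alpha
        for i in range(3):
            color[i] += weight * rgb[i]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:
            break  # pixel is effectively opaque; skip what lies behind
    return tuple(color)


# A half-transparent red splat in front of a solid blue one.
pixel = composite_pixel([
    (2.0, (0.0, 0.0, 1.0), 1.0),  # blue, farther away
    (1.0, (1.0, 0.0, 0.0), 0.5),  # red, closer
])
# pixel == (0.5, 0.0, 0.5)
```

The early exit once the pixel is opaque is one reason this approach is quick: work stops as soon as nothing behind can show through.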

The novel angle here is the monocular input: most 3DGS pipelines require many photos from known camera positions. Apple's patent targets reconstruction from a single image, which is a significantly harder problem and much more useful on consumer devices.

What this means for Vision Pro and spatial content

For Vision Pro and future Apple spatial computing hardware, the ability to turn any existing 2D photo library into navigable 3D scenes would be a meaningful new capability. You wouldn't need to re-shoot memories in spatial format; the device could reconstruct depth and generate a 3D experience from photos you already have.

More broadly, this sits at the intersection of two fast-moving research areas, monocular depth estimation and real-time 3D Gaussian rendering, and Apple filing here suggests it's building the full stack in-house. If this ships, it could considerably lower the barrier for developers building spatial content tools on Apple platforms, since the hardest part (multi-view capture) disappears.

Editorial take

This is a legitimately interesting patent, not a routine filing. Monocular 3D reconstruction is one of the harder open problems in computer vision, and pairing it with Gaussian splatting for real-time rendering is the right technical bet for 2025. The inventor list — which includes Vladlen Koltun, one of the most cited researchers in 3D vision — signals this is serious research, not a defensive filing.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.