Microsoft · Filed Dec 29, 2025 · Published May 7, 2026

Microsoft Patents Motion-Compensated Compression for 3D Point Cloud Video

Your phone's video codec has used motion compensation for decades — storing only what changed between frames instead of every pixel every time. Microsoft is now applying that same logic to 3D point clouds, the format that powers volumetric video, LiDAR scans, and immersive spatial experiences.

FIG. 1A from US 2026/0129233 A1, rendered from the official USPTO publication PDF.
Publication number: US 2026/0129233 A1
Applicant: Microsoft Technology Licensing, LLC
Filing date: Dec 29, 2025
Publication date: May 7, 2026
Inventors: Philip A. Chou, Ricardo de Queiroz
USPC classification: 375/240.16
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP; not yet assigned to an examiner)
Status: Docketed New Case - Ready for Examination (Jan 26, 2026)
Parent: continuation of application 17/682,018 (filed Feb 28, 2022)
Claims: 20

What Microsoft's 3D video compression patent actually does

Imagine watching a live 3D hologram of a basketball game. Unlike a flat video, this is a full three-dimensional cloud of millions of tiny dots — called voxels — that together form the players, the court, the ball. Sending all of that data every single frame would be enormous. You'd need a firehose of bandwidth just to stream a few seconds.

Microsoft's patent borrows a classic video compression trick: instead of resending the entire 3D scene every frame, only send what changed. It looks at a previous frame stored in memory, predicts what the current frame probably looks like based on movement, and only encodes the difference. This is called inter-frame coding, and it's the same reason your Netflix stream doesn't consume terabytes per hour.
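
To make the trick concrete, here's a toy sketch in Python. Everything in it (array sizes, the "nothing moved" prediction, the function names) is illustrative rather than taken from the patent; the point is only that a residual against a decent prediction is mostly zeros, which entropy coders compress almost for free.

```python
import numpy as np

# Toy inter-frame residual coding: send only the difference between the
# current frame and a prediction built from the previous frame.

def encode_inter(current: np.ndarray, predicted: np.ndarray) -> np.ndarray:
    """Residual = whatever the prediction got wrong."""
    return current - predicted

def decode_inter(predicted: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """Reconstruct by applying the delta back onto the prediction."""
    return predicted + residual

# Two 4x4x4 voxel "frames" where almost nothing moves between them.
rng = np.random.default_rng(0)
previous = rng.integers(0, 256, size=(4, 4, 4)).astype(np.int16)
current = previous.copy()
current[1, 2, 3] += 5                 # a single voxel changed

predicted = previous                  # simplest prediction: "nothing moved"
residual = encode_inter(current, predicted)

print(np.count_nonzero(residual))     # 1 -> nearly all zeros, cheap to code
assert np.array_equal(decode_inter(predicted, residual), current)
```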

The patent also adds a smart fallback: if motion compensation isn't efficient for a particular chunk of the scene, the system can switch to intra-frame mode and encode that chunk independently, from scratch; a cost-benefit calculation picks the better option automatically. There's even a custom error-correction filter tuned specifically for the quirks of 3D voxel data.
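
A minimal sketch of that fallback decision, under an assumed cost model (the nonzero count below is a crude stand-in for entropy-coded bits; the patent's actual cost-benefit calculation isn't spelled out in this summary):

```python
import numpy as np

def coding_cost(values: np.ndarray) -> int:
    """Crude rate proxy: how many nonzero values would need coding."""
    return int(np.count_nonzero(values))

def choose_mode(block: np.ndarray, predicted: np.ndarray) -> str:
    """Pick intra (code the block itself) or inter (code only the delta)."""
    intra_cost = coding_cost(block)
    inter_cost = coding_cost(block - predicted)
    return "inter" if inter_cost < intra_cost else "intra"

block = np.full((2, 2, 2), 7, dtype=np.int16)
good_pred = block.copy()              # prediction matches: residual is all zero
bad_pred = np.zeros_like(block)       # prediction is useless: residual == block
print(choose_mode(block, good_pred))  # "inter"
print(choose_mode(block, bad_pred))   # "intra"
```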

How inter-frame prediction works on voxel blocks

The patent describes a codec system for dynamic point cloud video — 3D scenes captured as sequences of voxel frames, like a time-series of LiDAR or depth-camera snapshots.

The core mechanism works like this:

  • Block partitioning: Each 3D frame is divided into box-shaped blocks of voxels (x × y × z regions, each dimension at least 2 voxels). Only occupied blocks, meaning those containing actual scene geometry, are processed.
  • Mode signaling: Each block gets a syntax element in the bitstream declaring whether it's encoded in intra-frame mode (self-contained, no reference to other frames) or inter-frame mode (predicted from a previously encoded reference frame).
  • Inter-frame prediction: For inter-coded blocks, the encoder finds a matching predicted block from a reference frame buffer (think of it as a decoded frame held in memory), computes the 3D difference between the current block and that prediction, and encodes only the residual — the delta. The decoder reconstructs the block by applying that delta back onto the reference.
  • Rate-distortion optimization: For each block, the system compares encoding cost against quality loss in both modes using a threshold, and picks whichever mode is more efficient (the sketch after this list ties all four steps together).
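
Here is one way the per-block loop could look in Python. The block size, the occupancy test, and the cost comparison are all assumptions for illustration; a real encoder would weigh rate against distortion with an entropy model and a proper threshold (for example a Lagrangian cost J = D + λR), not a nonzero count.

```python
import numpy as np
from itertools import product

BLOCK = 2  # hypothetical edge length; the patent only requires each dimension >= 2

def encode_frame(current: np.ndarray, reference: np.ndarray):
    """Yield (origin, mode, payload) for every occupied block of `current`,
    predicting each block from the co-located block in `reference`."""
    nx, ny, nz = current.shape
    for x, y, z in product(range(0, nx, BLOCK),
                           range(0, ny, BLOCK),
                           range(0, nz, BLOCK)):
        sel = (slice(x, x + BLOCK), slice(y, y + BLOCK), slice(z, z + BLOCK))
        block = current[sel]
        if not block.any():               # unoccupied block: skip entirely
            continue
        residual = block - reference[sel]
        # Crude rate proxy standing in for the patent's cost/quality threshold.
        if np.count_nonzero(residual) < np.count_nonzero(block):
            yield (x, y, z), "inter", residual    # mode flag + delta
        else:
            yield (x, y, z), "intra", block       # mode flag + full block

def decode_frame(shape, reference, stream):
    """Rebuild the frame from the signaled per-block modes."""
    out = np.zeros(shape, dtype=reference.dtype)
    for (x, y, z), mode, payload in stream:
        sel = (slice(x, x + BLOCK), slice(y, y + BLOCK), slice(z, z + BLOCK))
        out[sel] = (reference[sel] + payload) if mode == "inter" else payload
    return out

ref = np.zeros((4, 4, 4), dtype=np.int16)
ref[0:2, 0:2, 0:2] = 3
cur = ref.copy()
cur[0, 0, 0] = 9
assert np.array_equal(decode_frame(cur.shape, ref, encode_frame(cur, ref)), cur)
```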

A voxel-distortion-correction filter is applied post-decode to clean up compression artifacts specific to 3D voxel data — errors that standard 2D video filters aren't designed to handle.
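
The summary here doesn't detail that filter, so the following is purely a flavor of what post-decode voxel cleanup can look like: a 3×3×3 median filter that suppresses isolated reconstruction errors. The actual voxel-distortion-correction filter is presumably tuned very differently.

```python
import numpy as np
from scipy.ndimage import median_filter

def correct_voxel_distortion(decoded: np.ndarray) -> np.ndarray:
    """Replace each voxel with the median of its 3x3x3 neighborhood,
    which wipes out lone speckle artifacts without moving surfaces much."""
    return median_filter(decoded, size=3)

decoded = np.zeros((8, 8, 8), dtype=np.int16)
decoded[4, 4, 4] = 90                 # a lone speckle artifact from lossy coding
cleaned = correct_voxel_distortion(decoded)
print(decoded.max(), cleaned.max())   # 90 0 -> isolated artifact removed
```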

What this means for volumetric video and spatial computing

Volumetric video — 3D content you can walk around inside in AR or VR — is one of the most data-hungry media formats ever conceived. Without efficient compression, streaming real-time point clouds at scale is essentially impossible on current infrastructure. This patent addresses that bottleneck directly, adapting the motion-compensation techniques that made flat video streaming viable and applying them to the third dimension.

For Microsoft, this connects naturally to its investments in Azure spatial computing, HoloLens technology, and its broader push into mixed reality and 3D collaboration tools. If volumetric video ever becomes a mainstream format — for sports broadcasts, remote presence, or industrial digital twins — codec efficiency like this is the unsexy prerequisite that makes it practical. You may never think about it, but it's the kind of infrastructure work that determines whether a technology actually ships.

Editorial take

This is foundational codec engineering — not a flashy AI trick, but the kind of painstaking compression work that determines whether an entire media format survives contact with real networks. Microsoft filing this now, with spatial computing heating up, suggests they're quietly building the plumbing for volumetric video well before most people agree on what volumetric video even is. That's smart positioning, even if the patent itself reads like a graduate-level signal processing textbook.

Get one Big Tech patent every Sunday

Plain English, intelligent commentary, no hype. Free.

Source: full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice. Patentlyze may earn a commission if you click an affiliate link and make a purchase. This doesn't affect what we cover or how we cover it.