New Google Patents · Filed Nov 25, 2025 · Published Jun 4, 2026 · verified — real USPTO data

Google Patents an ML-Driven Way to Fill In Video Frames During Compression

When a video codec compresses a frame, it has to predict what each block of pixels looks like before encoding it — and Google thinks machine learning can do that job better than the fixed geometric rules codecs have used for decades.

Google Patent: Data-Driven Intra-Prediction for Video Compression — figure from US 2026/0156246 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number: US 2026/0156246 A1
Applicant: GOOGLE LLC
Filing date: Nov 25, 2025
Publication date: Jun 4, 2026
Inventors: Joseph Young, Shan Li, In Suk Chong, Raul Blazquez
US classification (USPC): 375/240.08
Grant likelihood: Medium
Examiner: not yet assigned (CENTRAL, DOCKET; Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Dec 12, 2025)
Parent application: Claims priority from provisional application 63/728,061 (filed Dec 4, 2024)
Document: 20 claims

What Google's pixel-prediction compression trick actually does

Imagine you're filling in a missing word in a sentence by reading the words around it. Video compression works the same way: when a codec squishes a video file, it tries to predict what each small block of pixels will look like based on the pixels already decoded nearby. The more accurate the prediction, the less data it actually has to store.

Today's codecs like AV1 and HEVC do this with a fixed menu of prediction modes, most of them directional: imagine drawing lines from neighboring pixels at various angles to fill in the blank. Google's patent describes replacing (or supplementing) those rigid rules with a data-driven model: one that picks specific neighboring pixels, runs them through a learned process, and produces a custom prediction grid tailored to what the content actually looks like.
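To make the "fixed menu" concrete, here is a minimal sketch of three classic intra-prediction rules (vertical, horizontal, and DC). It is a simplification for illustration, not code from AV1, HEVC, or the patent: real codecs have many more angles, interpolate between samples, and filter the edges. The function names and sample values are made up.

```python
def predict_vertical(top, size):
    """Copy the row above the block straight down."""
    return [list(top[:size]) for _ in range(size)]


def predict_horizontal(left, size):
    """Copy the column to the left of the block straight across."""
    return [[left[r]] * size for r in range(size)]


def predict_dc(top, left, size):
    """Fill the block with the rounded integer mean of the neighbors."""
    total = sum(top[:size]) + sum(left[:size])
    dc = (total + size) // (2 * size)
    return [[dc] * size for _ in range(size)]


# Hypothetical already-decoded neighbors of a 4x4 block.
top = [100, 110, 120, 130]
left = [100, 90, 80, 70]

print(predict_vertical(top, 4)[3])     # [100, 110, 120, 130]
print(predict_horizontal(left, 4)[3])  # [70, 70, 70, 70]
print(predict_dc(top, left, 4)[0])     # [100, 100, 100, 100]
```

Each rule is a hard-coded geometric recipe. None of them can adapt to texture or curved edges, which is the gap a learned predictor is meant to address.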

The result, in theory, is a tighter prediction and a smaller file — or better quality at the same file size. That matters whether you're streaming 4K video on YouTube or making a video call on Google Meet.

How the model converts neighbor samples into a prediction matrix

The patent describes a new intra-prediction mode (intra-prediction means predicting a block using only pixels from the same frame, as opposed to motion-based prediction that borrows from other frames). Instead of applying a fixed geometric direction, the encoder selects this data-driven mode and executes a pipeline with four steps:

  • Sample selection: Pixels are pulled from predefined fixed locations in the already-decoded neighboring regions around the current block — not just the immediately adjacent row and column, but specific hand-chosen positions.
  • Feature extraction and processing: Those sampled pixel values are treated as features and processed according to the active data-driven mode — likely involving a learned weight matrix or small neural network layer.
  • Matrix conversion: The processed features are converted into a matrix the same size as the block being predicted.
  • Block generation: That matrix becomes the prediction block, which the codec subtracts from the actual pixel values to produce a small residual that is much cheaper to encode.
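The four steps above can be sketched in a few lines. Everything specific here is an assumption for illustration: the sample offsets, the 2x2 block size, and the weight table are hypothetical, and the "learned process" is modeled as a single integer weight matrix, which is only one of the possibilities the article speculates about.

```python
BLOCK = 2  # hypothetical block size (2x2) to keep the example readable

# Step 1 input: hypothetical fixed sample positions, as (row, col) offsets
# from the block's top-left corner. Negative offsets reach into neighbors
# that have already been decoded.
SAMPLE_OFFSETS = [(-1, 0), (-1, 1), (0, -1), (1, -1)]

# Hypothetical "learned" weights in units of 1/4: one row per output pixel,
# one column per sampled feature. Integer math keeps results reproducible.
WEIGHTS = [
    [2, 0, 2, 0],
    [1, 2, 1, 0],
    [1, 0, 1, 2],
    [1, 1, 1, 1],
]
SHIFT = 2  # divide by 4, with rounding


def select_samples(frame, top, left):
    """Step 1: read pixels at the predefined fixed locations."""
    return [frame[top + dr][left + dc] for dr, dc in SAMPLE_OFFSETS]


def process_features(features):
    """Step 2: apply the weight table, then round and clip to 8 bits."""
    out = []
    for row in WEIGHTS:
        acc = sum(w * f for w, f in zip(row, features))
        value = (acc + (1 << (SHIFT - 1))) >> SHIFT
        out.append(max(0, min(255, value)))
    return out


def to_matrix(values, size):
    """Step 3: reshape the flat output into a size x size matrix."""
    return [values[r * size:(r + 1) * size] for r in range(size)]


def predict_block(frame, top, left):
    """Steps 1 to 4: produce the prediction block."""
    return to_matrix(process_features(select_samples(frame, top, left)), BLOCK)


def residual(actual, prediction):
    """What the encoder actually has to code: actual minus prediction."""
    return [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(actual, prediction)]


# Made-up 3x3 frame; the 2x2 block to predict starts at row 1, column 1.
frame = [
    [10, 20, 30],
    [40, 22, 28],
    [50, 41, 35],
]
prediction = predict_block(frame, 1, 1)
actual = [row[1:3] for row in frame[1:3]]
print(prediction)                    # [[30, 30], [40, 35]]
print(residual(actual, prediction))  # [[-8, -2], [1, 0]]
```

The residual values are small compared with the raw pixel values, which is why a good prediction makes the block cheaper to encode.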

The 'predefined fixed locations' detail is practically significant: it keeps the approach deterministic and avoids the overhead of a fully dynamic attention mechanism (where the model would have to decide which neighbors to look at on the fly). Both encoder and decoder can agree in advance on exactly which samples to use, which is essential because the decoder must rebuild exactly the same prediction the encoder used; any mismatch would drift and compound from block to block.
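A toy round trip shows why that agreement matters. This sketch is an illustration under stated assumptions, not the patent's method: it works on a 1-D row of pixels instead of 2-D blocks, the predictor is a stand-in integer average of the two previously reconstructed samples, and the quantizer step QSTEP is hypothetical.

```python
QSTEP = 4  # hypothetical quantizer step size


def predict(neighbors):
    """Stand-in deterministic predictor: rounded integer average."""
    return (sum(neighbors) + len(neighbors) // 2) // len(neighbors)


def quantize(residual):
    """Map a residual to the nearest quantizer level (sign-symmetric)."""
    sign = -1 if residual < 0 else 1
    return sign * ((abs(residual) + QSTEP // 2) // QSTEP)


def dequantize(level):
    return level * QSTEP


def encode(pixels, seed):
    """Return the quantized levels and the encoder's own reconstruction."""
    recon = list(seed)
    levels = []
    for p in pixels:
        # Predict from reconstructed samples, never from the originals,
        # because the decoder only ever sees reconstructed samples.
        pred = predict(recon[-2:])
        level = quantize(p - pred)
        levels.append(level)
        recon.append(pred + dequantize(level))
    return levels, recon[len(seed):]


def decode(levels, seed):
    """Rebuild pixels using the same predictor on the same samples."""
    recon = list(seed)
    for level in levels:
        pred = predict(recon[-2:])
        recon.append(pred + dequantize(level))
    return recon[len(seed):]


seed = [100, 104]              # already-decoded samples both sides share
pixels = [109, 115, 118, 121]  # made-up source pixels

levels, encoder_recon = encode(pixels, seed)
decoder_recon = decode(levels, seed)
print(levels)                          # [2, 2, 1, 1]
print(decoder_recon)                   # [110, 115, 117, 120]
print(decoder_recon == encoder_recon)  # True
```

The decoded pixels differ slightly from the source because quantization is lossy, but encoder and decoder end up with identical reconstructions. That only works because both sides run the same predictor on the same sample positions.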

What this means for future video codecs and streaming quality

Video compression is a trade-off between file size and visual quality, and the geometric intra-prediction rules in today's codecs are essentially decades-old heuristics. Swapping in a learned prediction function, even a lightweight one, can narrow the gap between what the codec guesses and what the image actually contains. A smaller gap means a smaller residual to encode, which translates into either lower bitrates or better picture quality at a given bandwidth.
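One way to picture "closing the gap" is to compare how far each candidate prediction lands from the real block. The sketch below uses the sum of absolute differences (SAD) as a rough proxy for coding cost and picks the cheaper mode. The block values and both candidate predictions are invented for illustration, and a real encoder would typically use a fuller rate-distortion cost that also counts the bits needed to signal the chosen mode.

```python
def sad(actual, prediction):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - p)
        for actual_row, pred_row in zip(actual, prediction)
        for a, p in zip(actual_row, pred_row)
    )


def choose_mode(actual, candidates):
    """Return the name of the candidate prediction with the lowest SAD."""
    return min(candidates, key=lambda name: sad(actual, candidates[name]))


# Made-up 2x2 block and two hypothetical candidate predictions.
actual = [[22, 28], [41, 35]]
candidates = {
    "dc": [[30, 30], [30, 30]],
    "data_driven": [[30, 30], [40, 35]],
}

for name, pred in candidates.items():
    print(name, sad(actual, pred))      # dc 26, then data_driven 11
print(choose_mode(actual, candidates))  # data_driven
```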

Google is a founding member of the Alliance for Open Media, which publishes AV1 and is developing its successor AV2, and it runs one of the world's largest video platforms in YouTube. A data-driven intra-prediction mode that survives the complexity-versus-quality tradeoff analysis could realistically appear in a next-generation open codec, affecting billions of video streams. For you as a viewer, that could mean sharper video at the same mobile data cost.

Editorial take

This is genuinely interesting codec research, not a vague software patent grab. The specific engineering choice to use fixed sample locations — trading some flexibility for decoder simplicity — suggests this comes from people who have actually tried to ship a codec, not just theorized about one. Whether it clears the bar for a next-gen codec's complexity budget is another question, but the underlying idea is sound and the prior art space here is competitive enough that Google likely wants it on the books.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.