Google · Filed Oct 30, 2025 · Published Apr 30, 2026 · verified — real USPTO data

Google Patents a Mixture-of-Experts AI System for Smarter Video Compression

Google is patenting an AI-driven video compression system that uses a team of machine learning models — not just one — to decide how to shrink each block of video more intelligently by looking at what changed between frames.

FIG. 1A of US 2026/0122256 A1, rendered from the official USPTO publication PDF.
Publication number: US 2026/0122256 A1
Applicant: Google LLC
Filing date: Oct 30, 2025
Publication date: Apr 30, 2026
Inventors: David Charles Minnen, Sung Jin Hwang, Elliott Michael Karpilovsky, Debargha Mukherjee
CPC classification: 375/240.12
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Jan 22, 2026)
Parent application: Continuation of 18286574 (filed 2023-10-12)

What Google's AI video compression actually does

Imagine you're watching a video call and the background barely moves, but your friend's hand is waving around. A smart compression system shouldn't treat those parts of the video the same way — the still background can be compressed aggressively, while the moving hand needs more detail preserved.

That's the core idea behind Google's new patent. Instead of applying a single, one-size-fits-all compression rule, it uses an ensemble of machine learning models (a so-called "mixture of experts") to analyze each small block of a video frame. Each model specializes in different kinds of content or motion patterns.

Critically, the system doesn't just look at the current frame in isolation. It compares each block to the same region in the previous frame, using the edges and borders of those blocks as key signals. That lets it make smarter predictions about what information actually needs to be transmitted — and what can be inferred.

How the ensemble model compares blocks across frames

The patent describes a video codec (compression/decompression system) built around a mixture-of-experts (MoE) architecture — a design where multiple specialized ML models each handle different parts of a problem, and a routing mechanism decides which expert to use for a given input.
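As a rough illustration of the routing idea, the sketch below blends the outputs of several experts using softmax gate weights, with an optional hard top-k selection. The function names, the gate scores, and the two toy experts are hypothetical; the patent summary does not say how its router is built, so this is the generic textbook form of a mixture of experts rather than Google's design.

```python
import math

def softmax(scores):
    """Convert raw gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_predict(expert_outputs, gate_scores, top_k=None):
    """Blend per-expert predictions for one block.

    expert_outputs: list of equal-length lists of predicted pixel values.
    gate_scores:    one raw score per expert (from a hypothetical router).
    top_k:          if set, keep only the k highest-scoring experts.
    """
    indices = list(range(len(expert_outputs)))
    if top_k is not None:
        indices = sorted(indices, key=lambda i: gate_scores[i], reverse=True)[:top_k]
    weights = softmax([gate_scores[i] for i in indices])
    length = len(expert_outputs[0])
    blended = [0.0] * length
    for weight, i in zip(weights, indices):
        for p in range(length):
            blended[p] += weight * expert_outputs[i][p]
    return blended

# Two toy experts: one tuned for static content, one for moving content.
static_expert = [100.0, 100.0, 100.0, 100.0]
motion_expert = [120.0, 80.0, 120.0, 80.0]

# Equal gate scores give an even blend; top_k=1 picks a single expert.
soft = mixture_predict([static_expert, motion_expert], gate_scores=[0.0, 0.0])
hard = mixture_predict([static_expert, motion_expert], gate_scores=[2.0, -1.0], top_k=1)
```

Soft blending lets every expert contribute, while top-k routing runs only the selected experts, which is cheaper at inference time.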

The system divides each video frame into a grid of blocks (small rectangular regions, a standard approach in video codecs like H.264 or AV1). It then processes frames in sequential pairs — always looking at the current frame alongside the prior frame.
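The block grid and the pairwise frame traversal can be sketched in a few lines. This is a minimal illustration that assumes frames are plain lists of pixel rows whose dimensions are exact multiples of the block size; `split_into_blocks` and `frame_pairs` are hypothetical helper names, not terms from the patent.

```python
def split_into_blocks(frame, block_size):
    """Yield (top, left, block) for each block_size x block_size tile.

    Assumes the frame dimensions are exact multiples of block_size;
    real codecs pad the frame or allow partial blocks instead.
    """
    height, width = len(frame), len(frame[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in frame[top:top + block_size]]
            yield top, left, block

def frame_pairs(frames):
    """Yield (prior, current) for each consecutive pair of frames."""
    for prior, current in zip(frames, frames[1:]):
        yield prior, current

# Three synthetic 4x4 frames whose pixel values encode their position.
frames = [[[t * 10 + r * 4 + c for c in range(4)] for r in range(4)]
          for t in range(3)]

pairs = list(frame_pairs(frames))
blocks = list(split_into_blocks(frames[0], block_size=2))
```

With three frames there are two (prior, current) pairs, and a 4x4 frame with 2x2 blocks yields four blocks.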

For each block, the prediction model receives three groups of inputs:

  • The first and second borders (edges) of the current block in the current frame
  • The first and second borders of the corresponding block in the prior frame
  • The actual pixel content of the prior frame's block
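To make the input list concrete, the sketch below gathers those values for one block. It assumes the "first" border is the row of pixels directly above the block and the "second" is the column directly to its left, as in conventional intra prediction; the summary above does not specify which borders the patent means, and the helper names and the mid-grey fallback of 128 are illustrative choices.

```python
def block_borders(frame, top, left, size):
    """Return (top_border, left_border) for the block at (top, left).

    Assumption: the borders are the pixel row above and the pixel column
    to the left of the block. Blocks on the frame edge have no such
    neighbours, so a mid-grey fallback value of 128 is used instead.
    """
    if top > 0:
        top_border = frame[top - 1][left:left + size]
    else:
        top_border = [128] * size
    if left > 0:
        left_border = [frame[r][left - 1] for r in range(top, top + size)]
    else:
        left_border = [128] * size
    return top_border, left_border

def model_inputs(prior, current, top, left, size):
    """Collect the inputs described above for a single block."""
    cur_top, cur_left = block_borders(current, top, left, size)
    pri_top, pri_left = block_borders(prior, top, left, size)
    prior_block = [row[left:left + size] for row in prior[top:top + size]]
    return {
        "current_borders": (cur_top, cur_left),
        "prior_borders": (pri_top, pri_left),
        "prior_block": prior_block,
    }

# Synthetic 4x4 frames: the current frame is the prior frame brightened by 1.
prior = [[r * 4 + c for c in range(4)] for r in range(4)]
current = [[value + 1 for value in row] for row in prior]

inputs = model_inputs(prior, current, top=2, left=2, size=2)
```

Note that the current block's own pixels are not among the inputs: the predictor sees only its borders plus the prior frame's data.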

By feeding in border information from both frames, the model can infer how a region has changed — capturing motion and texture shifts — without needing to transmit every pixel. The ensemble approach means different experts can specialize in, say, high-motion blocks versus static backgrounds, potentially outperforming a single monolithic model on diverse video content.
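Why a better prediction means fewer bits can be shown with conventional predictive coding, where the encoder sends only the difference between the actual block and the predicted one. The sketch below uses the simplest possible predictor (copy the prior block) as a stand-in; how the patent actually uses its prediction is not stated in this summary, so treat this as the standard scheme rather than Google's method.

```python
def predict_from_prior(prior_block):
    """Stand-in predictor: assume the block did not change."""
    return [row[:] for row in prior_block]

def residual(actual, predicted):
    """Difference the encoder would need to transmit."""
    return [[a - p for a, p in zip(a_row, p_row)]
            for a_row, p_row in zip(actual, predicted)]

def reconstruct(predicted, res):
    """Decoder side: prediction plus residual restores the block."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, res)]

def nonzero(res):
    """Count residual values that carry information."""
    return sum(1 for row in res for value in row if value != 0)

prior_block = [[50, 50], [50, 50]]
static_block = [[50, 50], [50, 50]]   # unchanged background
moving_block = [[50, 90], [50, 95]]   # something moved into the right column

prediction = predict_from_prior(prior_block)
static_residual = residual(static_block, prediction)
moving_residual = residual(moving_block, prediction)
```

The static block leaves an all-zero residual, which compresses to almost nothing, while the moving block leaves two nonzero values. An expert that predicts motion well would shrink that second residual too.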

What this means for streaming and video quality

Video compression is computationally expensive and commercially important. Google operates YouTube, one of the world's largest video platforms, so any efficiency gain in its codecs translates into lower bandwidth costs or better quality at the same bitrate across billions of streams. At that scale, even a compression improvement of a few percent can mean substantial infrastructure savings.

The MoE framing is notable because it mirrors an architecture trend in large language models: Mixtral is openly a mixture of experts, and Google has described Gemini 1.5 as one. Applying the same "team of specialists" logic to a low-level signal processing problem like video compression is an interesting crossover. If the approach works well in practice, it could eventually influence future open codecs such as AV2, which is being developed by the Alliance for Open Media, of which Google is a founding member.

Editorial take

This is a legitimately interesting patent — not because MoE is a new idea, but because applying it to per-block video prediction is a concrete, testable engineering bet. Google has the YouTube traffic to validate it at scale, and the team includes Debargha Mukherjee, a key figure behind the AV1 codec. Worth keeping an eye on.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.