Google Patents a More Efficient Way to Compress Video Transform Blocks
Google is filing patents on the low-level math that makes video smaller — and this one targets the specific order in which a video encoder reads data blocks, which turns out to matter more than you'd think.
What Google's wavefront scan order actually does
When a video is compressed, the encoder breaks each frame into small blocks of numbers representing brightness and color changes. To shrink those numbers down, the encoder has to decide what order to read them in — and it needs to make smart guesses about each number based on the ones it already processed.
This patent describes a specific scanning pattern — called a wavefront scan order — that moves through a block in a way that keeps several nearby values available for comparison at once. That lets the encoder make better predictions about each value, so it can describe the whole block using fewer bits.
The result is more efficient video compression at the codec level — the kind of incremental gain that, when multiplied across billions of video streams, actually adds up to meaningful bandwidth savings.
How the context model reads neighboring coefficients
The patent covers a method for entropy coding the quantized coefficients inside a transform block. Entropy coding is a lossless compression step that assigns shorter bit sequences to more common values. A transform block is the grid of numbers you get after applying a DCT or similar transform to a block of a video frame's pixel data and then quantizing the result.
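To make "shorter bit sequences for more common values" concrete, here is a minimal Python sketch. It is not taken from the patent: the coefficient values are made up for illustration, and `ideal_bits` is a hypothetical helper that charges each symbol the theoretical cost of -log2(p) bits, where p is that symbol's frequency in the list. A real entropy coder only approaches this bound.

```python
import math
from collections import Counter


def ideal_bits(symbols):
    """Total bits if each symbol costs -log2(p), with p its empirical frequency."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-math.log2(counts[s] / total) for s in symbols)


# Made-up quantized coefficients for illustration: mostly zeros, a few small values.
coeffs = [0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0]

# Baseline: a fixed-length code needs ceil(log2(#distinct values)) bits per symbol.
fixed_bits = len(coeffs) * math.ceil(math.log2(len(set(coeffs))))

print(f"fixed-length: {fixed_bits} bits")
print(f"entropy-coded (ideal): {ideal_bits(coeffs):.1f} bits")
```

Because zeros dominate this toy block, the ideal entropy-coded cost comes out well below the fixed-length cost. The better the probability estimates, the closer a coder gets to that saving, which is where the context model comes in.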
The key contribution is the co-design of two things that usually get designed separately:
- Scan order: the specific path the encoder takes when walking through the N×N grid of coefficients
- Context model: the statistical model that predicts the probability of each coefficient's value, which the entropy coder (typically an arithmetic coder) uses to spend fewer bits on likely values
The wavefront scan order is constrained so that for any position (x, y), the three positions directly above it, (x, y−1), (x, y−2), (x, y−3), and the three positions directly to its left, (x−1, y), (x−2, y), (x−3, y), have already been coded, wherever those positions fall inside the block. This means the context model always has a cluster of immediate neighbors (already-coded coefficients spatially close to the current one) available to inform its probability estimate.
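The constraint above can be checked mechanically. The Python sketch below is an illustration, not the patent's scan order: it assumes a plain anti-diagonal traversal as one example of an order that meets the constraint, and the names `antidiagonal_scan`, `required_neighbors`, and `satisfies_constraint` are hypothetical. Coordinates follow the article's convention, with x as the column and y as the row.

```python
from typing import List, Tuple

Pos = Tuple[int, int]  # (x, y): x is the column, y is the row


def antidiagonal_scan(n: int) -> List[Pos]:
    """Visit an n x n block one anti-diagonal (x + y = d) at a time."""
    order = []
    for d in range(2 * n - 1):
        for y in range(n):
            x = d - y
            if 0 <= x < n:
                order.append((x, y))
    return order


def required_neighbors(x: int, y: int, reach: int = 3) -> List[Pos]:
    """Up to `reach` positions above and to the left that lie inside the block."""
    above = [(x, y - k) for k in range(1, reach + 1) if y - k >= 0]
    left = [(x - k, y) for k in range(1, reach + 1) if x - k >= 0]
    return above + left


def satisfies_constraint(order: List[Pos], reach: int = 3) -> bool:
    """True if every required neighbor is visited before the position itself."""
    rank = {pos: i for i, pos in enumerate(order)}
    return all(
        rank[nb] < rank[(x, y)]
        for (x, y) in order
        for nb in required_neighbors(x, y, reach)
    )


scan = antidiagonal_scan(4)
print(scan[:6])                          # first positions visited
print(satisfies_constraint(scan))        # the anti-diagonal order passes
print(satisfies_constraint(scan[::-1]))  # the reversed order does not
```

Many different orders pass this check, so the constraint alone does not pin down a single scan. The article's point is that the scan is chosen with the context model's needs in mind, and a checker like this only verifies the precondition.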
By optimizing the scan path and the context model jointly, the encoder can exploit more of the statistical correlation between adjacent coefficients, reducing redundancy further than it could if the two were designed independently.
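As a rough sketch of how a context model might use those neighbors, the Python below picks a context by counting nonzero coefficients among the three positions above and the three to the left, then keeps a separate adaptive probability per context. This is an assumption for illustration only: the article does not describe the patent's actual context derivation, and `context_index`, `AdaptiveBinaryModel`, and the sample block are hypothetical.

```python
def context_index(block, x, y, reach=3, max_ctx=3):
    """Count nonzero neighbors above and to the left of (x, y), capped at max_ctx."""
    nonzero = 0
    for k in range(1, reach + 1):
        if y - k >= 0 and block[y - k][x] != 0:
            nonzero += 1
        if x - k >= 0 and block[y][x - k] != 0:
            nonzero += 1
    return min(nonzero, max_ctx)


class AdaptiveBinaryModel:
    """One pair of counts per context, used to estimate P(bit | context)."""

    def __init__(self, num_contexts):
        # Start every context at one count per outcome so no probability is zero.
        self.counts = [[1, 1] for _ in range(num_contexts)]

    def prob(self, ctx, bit):
        zeros, ones = self.counts[ctx]
        return self.counts[ctx][bit] / (zeros + ones)

    def update(self, ctx, bit):
        self.counts[ctx][bit] += 1


# Made-up 4x4 block, indexed as block[y][x]; larger values cluster in the top-left.
block = [
    [9, 4, 1, 0],
    [3, 2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

model = AdaptiveBinaryModel(num_contexts=4)
ctx = context_index(block, 1, 1)   # neighbors (1, 0) and (0, 1) are both nonzero
is_nonzero = int(block[1][1] != 0)
print(ctx, model.prob(ctx, is_nonzero))
model.update(ctx, is_nonzero)      # later coefficients in this context benefit
```

The scan order matters here because `context_index` reads only positions that a decoder would already have reconstructed. If the scan visited a position before its neighbors, the decoder could not compute the same context and the two sides would fall out of sync.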
What this means for Google's next-gen video codecs
Google is a founding member of the Alliance for Open Media, the consortium that develops AV1 and its successor, AV2. Low-level coding tools like this (scan orders, context models, entropy coders) are exactly the kind of building blocks that get baked into a next-generation open codec. Even a fraction-of-a-percent improvement in compression efficiency matters when you're streaming YouTube at the scale Google operates.
For you as an end user, better entropy coding means the same video quality at lower bitrates, or better quality at the same bitrate. That translates to faster loads, less buffering on slow connections, and lower storage costs for platforms. It's not glamorous, but it's the plumbing that makes video streaming work better.
This is the kind of patent that will never make a product announcement but could quietly ship inside a future codec such as AV2, where it would touch a very large share of streamed video. The joint optimization angle is the real technical substance here: treating scan order and context model as a single design problem rather than two separate ones is a legitimate research contribution, not just a variation on existing approaches.
Editorial commentary on a publicly published patent application. Not legal advice.