New Google Patents · Filed Jan 26, 2026 · Published Jun 4, 2026 · verified — real USPTO data

Google Patents an AI That Runs Full-Strength on a Phone Chip

By Patentlyze Team · Updated Jun 5, 2026

Google is patenting a neural network blueprint that does more with less — squeezing strong image-recognition performance out of architectures thin enough to run on a smartphone chip.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number US 2026/0154533 A1

Applicant Google LLC

Filing date Jan 26, 2026

Publication date Jun 4, 2026

Inventors Andrew Gerald Howard, Mark Sandler, Liang-Chieh Chen, Andrey Zhmoginov, Menglong Zhu

CPC classification 706/27

Grant likelihood Medium

Examiner CENTRAL, DOCKET (Art Unit OPAP)

Status Docketed New Case - Ready for Examination (Mar 2, 2026)

Parent application is a Continuation of 18486534 (filed 2023-10-13)

Document 1 claims

AI/ML

What Google's efficient neural network design actually does

Imagine trying to move furniture through a narrow hallway. Conventional residual networks keep information in wide layers at the entry and exit of each block and squeeze it through a narrow layer only in the middle, and those wide layers are expensive in both memory and processing power. Google's design flips that idea on its head.

Instead of starting wide and going narrow, this architecture keeps the entry and exit points of each processing block deliberately thin, while temporarily expanding in the middle only where the actual heavy-lifting computation happens. Think of it like packing your suitcase more efficiently: you spread things out to sort them, then compress them back down before closing the lid.
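To make the thin-wide-thin shape concrete, here is a minimal sketch that lists the channel count at each stage of one block. The function name, the 24-channel bottleneck, and the expansion factor of 6 are illustrative assumptions, not figures taken from the application.

```python
def block_channel_widths(bottleneck_channels, expansion_factor):
    """Return (stage, channel count) pairs for one thin-wide-thin block.

    Both arguments are hypothetical illustration values.
    """
    expanded = bottleneck_channels * expansion_factor
    return [
        ("thin input", bottleneck_channels),
        ("expanded middle", expanded),
        ("thin output", bottleneck_channels),
    ]


# Print the width at each stage for an assumed 24-channel bottleneck.
for stage, channels in block_channel_widths(24, 6):
    print(f"{stage:>16}: {channels} channels")
```

Only the thin ends need to be kept around between blocks; the wide middle exists temporarily inside the block.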

The result is a network that can be deployed on devices with limited memory and battery life — like smartphones, tablets, or wearables — without needing a data center to back it up. You get capable AI inference running locally, without draining your battery or requiring a cloud round-trip.

How inverted residual blocks cut compute without losing accuracy

The patent describes a convolutional neural network (CNN) architecture built around two key structural ideas.

The first is a linear bottleneck layer — a narrow layer that compresses feature representations without applying a nonlinear activation function (which would otherwise destroy useful information at low dimensionality). Placing these thin layers at the input and output of each processing block preserves information while keeping memory use low.
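A toy illustration of why a nonlinearity can hurt in a narrow layer: ReLU zeroes every negative value, so inputs that differ only in a negative coordinate become indistinguishable. This is a simplified sketch with made-up 2-D points, and the identity function merely stands in for a linear (activation-free) bottleneck; it is not the layer described in the application.

```python
def relu(vector):
    """Clamp every negative component to zero."""
    return [max(0.0, x) for x in vector]


def linear(vector):
    """Stand-in for a bottleneck with no activation: values pass unchanged."""
    return list(vector)


# Three distinct hypothetical feature vectors in a 2-channel layer.
points = [(-1.0, 2.0), (-3.0, 2.0), (-0.5, 2.0)]

# Collect the distinct outputs each option produces.
after_relu = {tuple(relu(p)) for p in points}
after_linear = {tuple(linear(p)) for p in points}

print("distinct outputs after ReLU:  ", len(after_relu))
print("distinct outputs, no activation:", len(after_linear))
```

With only two channels there is no spare dimension to carry what ReLU discards, which is the intuition behind leaving the thin layers linear.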

The second is an inverted residual block. Traditional residual networks (like ResNet) connect wide layers with a shortcut and use narrow bottlenecks only in the middle. Google's design inverts this: the shortcut connection runs between the two thin bottleneck layers, while the middle of the block expands into a wider representation for computation. The wide middle relies on depthwise separable convolutions, a technique that splits a standard convolution into two cheaper operations (a spatial filter applied to each channel separately, then a 1×1 convolution that mixes information across channels), sharply reducing the number of multiplications required.
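The savings from splitting a convolution can be estimated with simple arithmetic. The sketch below counts multiplications for a standard convolution and for a depthwise separable one, assuming stride 1, padding that keeps the spatial size unchanged, and no bias terms. The function names and the 56×56×144 layer size are illustrative assumptions.

```python
def standard_conv_mults(height, width, in_channels, out_channels, kernel=3):
    """Multiplications for a standard k x k convolution (stride 1, same padding)."""
    return height * width * in_channels * out_channels * kernel * kernel


def depthwise_separable_mults(height, width, in_channels, out_channels, kernel=3):
    """Multiplications for a depthwise k x k filter plus a 1 x 1 pointwise mix."""
    depthwise = height * width * in_channels * kernel * kernel  # one filter per channel
    pointwise = height * width * in_channels * out_channels     # mixes across channels
    return depthwise + pointwise


# Hypothetical layer: 56 x 56 feature map, 144 channels in and out.
standard = standard_conv_mults(56, 56, 144, 144)
separable = depthwise_separable_mults(56, 56, 144, 144)

print(f"standard:  {standard:,} multiplications")
print(f"separable: {separable:,} multiplications")
print(f"reduction: {standard / separable:.1f}x")
```

For a 3×3 kernel the reduction approaches 9x as the channel count grows, because the ratio works out to k²·c_out / (k² + c_out).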

Together, these ideas form the backbone of what Google previously released publicly as MobileNetV2. This application formalizes that architecture with claims covering the structural combination of linear bottlenecks and inverted residuals in a single convolutional block.

What this means for AI running directly on your phone

The practical payoff here is efficient on-device inference — AI that runs on your phone without phoning home. Architectures like this underpin Google's real-time camera features, object detection in Google Lens, and on-device speech processing. The thinner the network, the faster it runs and the less it taxes your battery.

This is also a competitive moat play. As AI moves increasingly to the edge (meaning your device, not a server), whoever has the best low-power inference architecture has a durable advantage. Google filing this as a patent application years after MobileNetV2's public release reads largely as a defensive IP move to formally claim the design space it already pioneered.

Editorial take

This is MobileNetV2 in patent form: a genuinely important architecture that Google already published as research in 2018 and has been shipping in products for years. The technical contribution is real and well understood by the ML community; the filing is mostly a legal formalization of work Google has already made public. If you're tracking Google's AI efficiency IP portfolio, this is worth a bookmark, but it's not a new development.


Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
