Qualcomm · Filed Nov 20, 2024 · Published May 21, 2026 · verified — real USPTO data

Qualcomm Patents an Adaptive Aerial-View Mapping System for Sensor Processing

By Patentlyze Team · Updated Jul 10, 2026

Qualcomm has filed a patent application describing a system that builds a bird's-eye-view representation of an environment from camera images, then reuses a targeted spatial 'mask' so that subsequent frames can be processed faster and with tighter focus. The aim is to avoid reprocessing the same parts of a scene over and over.

Figure from the official USPTO publication.

Publication number: US 2026/0141687 A1
Applicant: QUALCOMM Incorporated
Filing date: Nov 20, 2024
Publication date: May 21, 2026
Inventors: Per CRONVALL, Gustav Nils Ture PERSSON, Gustav Lars Henrik JAGBRANT
CPC classification: 382/104
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Dec 18, 2024)
Document: 20 claims

AI vision

What Qualcomm's aerial-view feature masking actually does

Imagine you're driving and your car's cameras are constantly trying to understand everything around you — the road, the curb, the pedestrians. Doing that from scratch on every single camera frame is expensive. Qualcomm's patent describes a smarter approach: build a detailed overhead map of the scene once, then mark off which regions actually need close attention on the next pass.

That marked region is called a mask. Once the system knows a particular patch of road or intersection matters, it uses that mask to focus processing on just that area in future image frames — instead of re-analyzing the whole scene every time.

The result is a pipeline that is both more efficient and more spatially aware. Rather than treating each frame as an isolated snapshot, it carries forward contextual knowledge about where things are happening — which is exactly the kind of persistent spatial understanding that autonomous systems and robotics need.

How the encoder and mask pipeline reuse spatial context

The application describes a two-stage pipeline running on an image-processing device, plausibly an automotive-grade edge processor such as those in Qualcomm's Snapdragon Ride platform.

Stage 1 — Build the aerial view: An encoder (a neural network that compresses images into compact feature representations) processes a batch of camera images and extracts image features. Those features are then transformed into aerial view features — think of this as a top-down, BEV (Bird's Eye View) representation where each feature corresponds to a specific real-world region in the environment.
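
The summary above does not say how image features are lifted into the aerial view, so what follows is only a minimal Python sketch under a strong assumption: each image feature already carries an estimated ground-plane position in metres. The function name to_aerial_view, the cell_size parameter, and the tuple layout are all hypothetical. It illustrates the one property the text describes, that each aerial-view feature corresponds to a specific real-world region.

```python
def to_aerial_view(features, cell_size=1.0):
    """Pool image features into ground-plane grid cells by averaging.

    `features` is a list of (x_m, y_m, vector) tuples, where (x_m, y_m) is an
    ASSUMED ground-plane position in metres for that image feature.
    Returns a dict mapping (col, row) grid cells to the mean feature vector.
    """
    sums, counts = {}, {}
    for x_m, y_m, vec in features:
        # Each grid cell stands for one cell_size x cell_size patch of ground.
        cell = (int(x_m // cell_size), int(y_m // cell_size))
        prev = sums.get(cell, [0.0] * len(vec))
        sums[cell] = [s + v for s, v in zip(prev, vec)]
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: [s / counts[cell] for s in total]
            for cell, total in sums.items()}


# Two features land in cell (0, 0); one lands in cell (2, 1).
image_features = [
    (0.2, 0.7, [1.0, 3.0]),
    (0.9, 0.1, [3.0, 5.0]),
    (2.5, 1.5, [4.0, 4.0]),
]
bev = to_aerial_view(image_features)
print(sorted(bev))
```

A production system would more likely use a learned or geometry-based view transform; averaging into a grid is just the simplest stand-in that keeps the cell-to-region correspondence visible.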

Stage 2 — Generate and apply the mask: The system identifies a region of interest and creates a first mask tied to that region and its associated aerial view features. When the next batch of camera images arrives, the encoder generates new image features — but instead of processing all of them equally, the mask is applied to focus computation specifically on the features relevant to that pre-identified region.
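
As a rough illustration of this stage, the Python sketch below treats the region of interest as a rectangle of grid cells and the mask as the set of aerial-view cells inside it. The names make_mask and apply_mask, the rectangular region, and the (x_m, y_m, vector) feature layout are assumptions for illustration and are not taken from the filing.

```python
def make_mask(bev, roi):
    """Return the set of aerial-view cells inside a rectangular region.

    `bev` maps (col, row) cells to feature vectors.
    `roi` is (col_min, row_min, col_max, row_max), inclusive (hypothetical).
    """
    c0, r0, c1, r1 = roi
    return {cell for cell in bev
            if c0 <= cell[0] <= c1 and r0 <= cell[1] <= r1}


def apply_mask(features, mask, cell_size=1.0):
    """Keep only next-frame features whose ground cell is in the mask."""
    kept = []
    for x_m, y_m, vec in features:
        cell = (int(x_m // cell_size), int(y_m // cell_size))
        if cell in mask:
            kept.append((x_m, y_m, vec))
    return kept


# Aerial-view features from the first batch of images.
bev = {(0, 0): [0.1], (1, 0): [0.2], (5, 5): [0.3]}
mask = make_mask(bev, roi=(0, 0, 1, 1))

# Features from the next batch: only those inside the mask are kept.
next_features = [(0.5, 0.5, [1.0]), (5.2, 5.9, [2.0]), (1.4, 0.2, [3.0])]
focused = apply_mask(next_features, mask)
print(len(focused), "of", len(next_features), "features kept")
```

A set of cells is used here only because it makes the membership test obvious; a dense boolean grid would serve the same purpose on real hardware.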

Encoder generates image features from raw camera frames
Features are projected into a top-down aerial view space
A spatial mask is created for a region of interest
The mask guides processing of subsequent frames, reducing redundant work

This is a form of temporal feature reuse — leveraging what you learned in the last time step to reduce work in the current one, which is a well-established efficiency strategy in video and autonomous-driving neural networks.
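
One plausible reading of temporal feature reuse is sketched below in Python: cells outside the mask are carried forward from the previous time step, and only masked cells are recomputed. The summary does not state that unmasked features are cached this way, so treat update_aerial_view and the recompute callback as hypothetical names for an assumed design.

```python
def update_aerial_view(previous, mask, recompute):
    """Recompute only masked cells; carry every other cell forward unchanged.

    `previous` maps (col, row) cells to feature vectors from the last step.
    `recompute` is a caller-supplied function: cell -> fresh feature vector.
    """
    updated = dict(previous)  # shallow copy, so unmasked cells are reused as-is
    for cell in mask:
        updated[cell] = recompute(cell)
    return updated


previous = {(0, 0): [0.1], (1, 0): [0.2], (5, 5): [0.3]}
mask = {(0, 0), (1, 0)}
calls = []


def recompute(cell):
    calls.append(cell)  # record how many cells actually needed fresh work
    return [1.0]


current = update_aerial_view(previous, mask, recompute)
print(len(calls), "of", len(previous), "cells recomputed")
```

The saving scales with the share of cells left outside the mask, which is why a small region of interest matters most on power-constrained hardware.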

What this means for autonomous vehicles and edge AI sensors

For autonomous vehicles, drones, and robotics, real-time spatial understanding is one of the hardest computational problems to solve at the edge. Every millisecond of latency matters, and every watt of power consumed is a constraint. A system that can selectively reprocess only the parts of a scene that are relevant — rather than brute-forcing the full image set every frame — is meaningfully more deployable on power-constrained hardware.

Qualcomm is positioning itself as a key supplier of automotive and robotics AI chips. This patent fits squarely into that strategy: it's the kind of efficiency-focused perception work that makes on-device inference viable without requiring a data-center-class GPU in the trunk of your car. If this approach lands in production silicon, it could reduce the compute burden of BEV perception pipelines — one of the most resource-intensive parts of any autonomous driving stack.

Editorial take

This is solid, unglamorous perception engineering — the kind of work that separates chips that can actually run full AV stacks from ones that can't. The mask-based temporal reuse idea isn't wildly novel in concept, but the specific claim around dynamically generating and applying aerial-view masks across encoder passes is a concrete technical contribution worth watching. Qualcomm clearly wants to own the inference layer for autonomous systems, and patents like this are the building blocks.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
