Meta · Filed Dec 17, 2025 · Published Jun 25, 2026 · verified — real USPTO data

Meta Patent Uses Dual AI Models to Isolate Clean Audio Signals

By Patentlyze Team · Updated Jun 26, 2026

Most audio cleanup tools treat noise removal as a single-step job. Meta's newly published patent application splits the work across two AI models that are trained together to hand off the task in sequence, and that division of labor appears to produce cleaner results.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number: US 2026/0179640 A1
Applicant: Meta Platforms Technologies, LLC
Filing date: Dec 17, 2025
Publication date: Jun 25, 2026
Inventors: Ashutosh Pandey, Juan Azcarreta Ortiz, Ali Aroudi, Buye Xu, Cagdas Bilen, Jacob Ryan Donley, Sanha Lee, Ke Li, Daniel Davis Eugene Wong
CPC classification: 704/205
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Jan 15, 2026)
Parent application: Claims priority from provisional application 63737415 (filed 2024-12-20)
Document: 20 claims

AI/ML

What Meta's two-step AI audio cleaner actually does

Imagine you're on a video call and a lawnmower starts up outside. Your device's audio system has to figure out which sounds are your voice and which are noise, then strip out the noise without making your voice sound robotic or hollow. That's a hard problem, and most systems make tradeoffs.

Meta's patent describes a smarter split: a first AI model listens to the raw audio and makes an educated guess about what the target sound (say, your voice) probably sounds like. That guess is then used to configure a traditional audio filter, which does a first pass of cleanup. A second AI model takes that filtered audio and polishes it further.

The key twist is that both models are trained together from the start, so they learn to cooperate rather than work independently. The first model learns to make guesses that are useful for the filter, and the second model learns to handle whatever the filter leaves behind.

How the estimate-then-filter pipeline is structured

The system chains two machine-learning models with a classical signal-processing step between them.

The first model takes raw input audio and generates an "intermediate estimate" of what the target signal (usually speech) looks like on its own.
That estimate, combined with the original audio, is used to calculate the parameters of an adaptive filter (a filter whose behavior changes based on the input, rather than being fixed).
The adaptive filter runs on the original audio to produce a cleaner, filtered version.
The second model takes that filtered audio and generates a final enhanced output.
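The four steps above can be sketched as a simple data flow. This is a minimal illustration, not the application's implementation: both "models" are toy stand-ins (a moving average and a light blend), the filter is a per-frame gain chosen for brevity, and every function name and the `frame` parameter are hypothetical.

```python
from typing import List, Sequence

Signal = List[float]


def smooth(x: Sequence[float]) -> Signal:
    """3-tap moving average; the window shrinks at the edges."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out


def first_model(noisy: Sequence[float]) -> Signal:
    """Step 1 (toy stand-in): rough estimate of the target signal."""
    return smooth(noisy)


def filter_gains(estimate: Sequence[float], noisy: Sequence[float],
                 frame: int = 4) -> List[float]:
    """Step 2: one filter gain per frame, from the estimate AND the original.

    Each gain is the least-squares scale that maps the noisy frame onto the
    estimated frame, clipped to the range [0, 1].
    """
    gains = []
    for start in range(0, len(noisy), frame):
        x = noisy[start:start + frame]
        s = estimate[start:start + frame]
        num = sum(a * b for a, b in zip(s, x))
        den = sum(b * b for b in x) + 1e-12  # avoid division by zero
        gains.append(min(1.0, max(0.0, num / den)))
    return gains


def apply_filter(noisy: Sequence[float], gains: Sequence[float],
                 frame: int = 4) -> Signal:
    """Step 3: run the adaptive filter on the original audio."""
    return [gains[i // frame] * v for i, v in enumerate(noisy)]


def second_model(filtered: Sequence[float]) -> Signal:
    """Step 4 (toy stand-in): refine the filtered audio."""
    sm = smooth(filtered)
    return [0.5 * f + 0.5 * s for f, s in zip(filtered, sm)]


def enhance(noisy: Sequence[float], frame: int = 4) -> Signal:
    """Chain the four steps: estimate, configure filter, filter, refine."""
    estimate = first_model(noisy)
    gains = filter_gains(estimate, noisy, frame)
    filtered = apply_filter(noisy, gains, frame)
    return second_model(filtered)


if __name__ == "__main__":
    print(enhance([0.0, 0.9, 0.1, 1.1, -0.1, -0.9, 0.1, -1.1]))
```

Note that the filter's gains are recomputed for every input, which is what makes it adaptive, and that the filter is applied to the original audio rather than to the first model's estimate.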

The defining feature is joint end-to-end training, meaning both models are trained simultaneously on the same objective: make the final output sound as close to the clean target signal as possible. In most conventional pipelines, components are designed or trained separately. Here, the first model's behavior is shaped by how well the second model ultimately performs, so the two learn to complement each other.
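To make "one objective, two models" concrete, here is a toy sketch of joint training. It assumes a single scalar parameter `a` for the first model and two scalars `b` and `c` for the second, and it uses finite-difference gradients in place of the backpropagation a real system would use. All names and the test signal are hypothetical; the point is only that one loss on the final output updates both models at once.

```python
import math
from typing import List, Sequence, Tuple


def smooth(x: Sequence[float]) -> List[float]:
    """3-tap moving average; the window shrinks at the edges."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out


def forward(params: Sequence[float], noisy: Sequence[float]) -> List[float]:
    a, b, c = params
    sm = smooth(noisy)
    # First model (toy): blend of raw and smoothed input, controlled by a.
    estimate = [a * s + (1.0 - a) * x for s, x in zip(sm, noisy)]
    # Filter parameter from estimate + original: one least-squares gain.
    gain = (sum(e * x for e, x in zip(estimate, noisy))
            / (sum(x * x for x in noisy) + 1e-12))
    filtered = [gain * x for x in noisy]
    # Second model (toy): mix of the filtered audio and its smoothed copy.
    fs = smooth(filtered)
    return [b * f + c * s for f, s in zip(filtered, fs)]


def loss(params: Sequence[float], noisy: Sequence[float],
         clean: Sequence[float]) -> float:
    """Single objective: mean squared error of the FINAL output."""
    out = forward(params, noisy)
    return sum((o - t) ** 2 for o, t in zip(out, clean)) / len(clean)


def grad(params: Sequence[float], noisy: Sequence[float],
         clean: Sequence[float], h: float = 1e-5) -> List[float]:
    """Central finite differences over all parameters of both models."""
    g = []
    for i in range(len(params)):
        up = list(params)
        dn = list(params)
        up[i] += h
        dn[i] -= h
        g.append((loss(up, noisy, clean) - loss(dn, noisy, clean)) / (2 * h))
    return g


def train_jointly(noisy: Sequence[float], clean: Sequence[float],
                  steps: int = 200, lr: float = 0.1) -> List[float]:
    params = [0.0, 1.0, 0.0]  # start as a pass-through pipeline
    for _ in range(steps):
        g = grad(params, noisy, clean)
        # One update touches the first model (a) and the second (b, c).
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params


def make_example(n: int = 32) -> Tuple[List[float], List[float]]:
    """Sine wave as the clean target, plus alternating +/-0.3 'noise'."""
    clean = [math.sin(2 * math.pi * i / 16) for i in range(n)]
    noisy = [c + (0.3 if i % 2 == 0 else -0.3) for i, c in enumerate(clean)]
    return noisy, clean


if __name__ == "__main__":
    noisy, clean = make_example()
    print("loss before:", loss([0.0, 1.0, 0.0], noisy, clean))
    print("loss after: ", loss(train_jointly(noisy, clean), noisy, clean))
```

Because the gradient for `a` is computed through the filter and the second model, the first model is rewarded for estimates that help the final output, not for estimates that look good on their own.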

Adaptive filters are well established in audio engineering, but using a neural network to dynamically compute their parameters in real time is a more recent approach. Following the filter with a second neural network that refines its output adds another layer of correction for artifacts the filter introduces.
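This article does not say which kind of adaptive filter the application uses. Purely as a textbook illustration, a Wiener-style gain per time-frequency bin could be built from the first model's estimate and the original audio. Here X is the noisy input, Ŝ is the first model's estimate, and treating the residual X − Ŝ as the noise estimate is an assumption made for this sketch:

```latex
G(t,f) = \frac{|\hat{S}(t,f)|^{2}}{|\hat{S}(t,f)|^{2} + |X(t,f) - \hat{S}(t,f)|^{2} + \varepsilon},
\qquad
Y(t,f) = G(t,f)\, X(t,f)
```

The gain approaches 1 in bins where the estimate accounts for most of the input and approaches 0 where it accounts for little, so the filter changes with every frame of audio. The small constant ε only prevents division by zero.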

What this means for Meta's AR and VR audio hardware

Meta makes the Quest VR headsets and is deep in development on AR glasses (the Ray-Ban Meta line already has microphones). Both product categories depend heavily on audio quality: voice commands, calls, spatial audio, and passthrough communication all require good noise separation in real time on constrained hardware.

A two-model approach like this could allow Meta to tune the tradeoff between the two stages depending on the device. A lightweight first model and filter could run on a low-power wearable chip, with the second model adding refinement when more compute is available. Whether this shows up in a headset, glasses, or a future device isn't clear from the patent alone, but the direction is obvious: Meta wants cleaner voice audio without waiting for a single massive AI model to do all the work.

Editorial take

This is a solid audio-ML patent, not a flashy one. The two-model-plus-adaptive-filter structure is a genuine engineering idea, and the joint training angle is the part worth paying attention to. If you care about how AR glasses will handle calls in a noisy coffee shop, this is exactly the kind of foundational work that makes that possible.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
