New Google Patents · Filed Feb 13, 2026 · Published Jun 25, 2026 · verified — real USPTO data

New Patent Turns Unclear or Stuttered Speech Into Clean Audio for Each Speaker

By Patentlyze Team · Updated Jun 26, 2026

Google has applied to patent a system that takes the speech of someone with a stutter, dysarthria, or another atypical speech pattern and converts it, in real time, into a clean, fluent audio version of what they said. Each person gets their own small personalized AI module that teaches the main model what their speech sounds like.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number US 2026/0179607 A1

Applicant Google LLC

Filing date Feb 13, 2026

Publication date Jun 25, 2026

Inventors Fadi Biadsy, Youzheng Chen, Xia Zhang, Oleg Rybakov, Andrew M. Rosenberg, Pedro J. Moreno Mengibar

USPC classification 704/232

Grant likelihood Medium

Examiner CENTRAL, DOCKET (Art Unit OPAP)

Status Docketed New Case - Ready for Examination (Mar 20, 2026)

Parent application Continuation of 18184630 (filed Mar 15, 2023)

Document 20 claims

AI/ML

How Google's speech-clearing system actually works

Imagine you have a condition like ALS or cerebral palsy that affects how your voice sounds. Your words come out slurred, halting, or otherwise different from what most voice assistants expect. Most speech-to-speech tools either fail completely or produce garbled output because they were trained on typical voices.

Google's patent describes a system that assigns each person with atypical speech their own small, dedicated sub-model. When you speak, the system looks up your personal ID, loads that sub-model, and uses it to guide the main AI as it listens to you. The result is a clean, fluent audio version of what you actually said, preserving your intended words.

The clever part is the design: the personalized piece is small, not a whole separate model for every user. Google calls these "residual adapters" -- lightweight add-ons slotted into the existing AI architecture. That means the system could scale to many users without requiring enormous amounts of extra computing power for each one.
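To put rough numbers on that scaling argument, here is a back-of-envelope sketch. Every figure in it (model width, layer count, bottleneck size, base model size) is a hypothetical value chosen for illustration, and the two-projection adapter shape is an assumption; the patent summary above gives none of these details.

```python
# Back-of-envelope estimate of per-user storage for small bottleneck adapters.
# All sizes below are hypothetical; none are taken from the patent.

def adapter_params(d_model: int, bottleneck: int) -> int:
    """Parameters in one adapter: down-projection plus up-projection, with biases."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up

def per_user_params(d_model: int, bottleneck: int, num_layers: int) -> int:
    """Assume one adapter per encoder layer."""
    return num_layers * adapter_params(d_model, bottleneck)

D_MODEL = 512                    # hypothetical encoder width
BOTTLENECK = 64                  # hypothetical adapter bottleneck size
NUM_LAYERS = 12                  # hypothetical number of encoder layers
BASE_MODEL_PARAMS = 100_000_000  # hypothetical shared model size

user_params = per_user_params(D_MODEL, BOTTLENECK, NUM_LAYERS)
fraction = user_params / BASE_MODEL_PARAMS

print(f"per-user adapter parameters: {user_params:,}")
print(f"share of base model: {fraction:.2%}")
```

With these assumed sizes the script reports 793,344 adapter parameters per user, about 0.79% of the hypothetical 100-million-parameter base model. The real ratio depends on dimensions the article does not state.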

How the encoder adapters reshape each speaker's audio

The patent describes a speech conversion model built around a standard encoder-decoder architecture (the encoder listens and encodes the audio; the decoder generates the output speech). What makes this unusual is how personalization is layered on top.

The encoder is built from a stack of self-attention blocks (layers of an AI that weigh different parts of the audio against each other to find patterns). Between those blocks, Google inserts residual adapters -- small neural network modules that nudge the encoder's understanding of the audio without replacing the whole model. "Residual" means the adapter adds a small correction on top of the existing signal rather than replacing it entirely.
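The "adds a small correction" idea can be shown in a few lines. This is a minimal sketch in plain Python, not the patent's implementation: the bottleneck shape, the ReLU nonlinearity, and the zero-initialized up-projection are common adapter conventions assumed here for illustration, and all names are hypothetical.

```python
# Minimal residual adapter in plain Python (no ML framework).
# Shapes, names, and the ReLU bottleneck are illustrative assumptions.
import random

def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def relu(vec):
    return [max(0.0, v) for v in vec]

class ResidualAdapter:
    def __init__(self, d_model, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection starts with small random weights.
        self.w_down = [[rng.uniform(-0.1, 0.1) for _ in range(d_model)]
                       for _ in range(bottleneck)]
        # Up-projection starts at zero, so an untrained adapter changes nothing.
        self.w_up = [[0.0] * bottleneck for _ in range(d_model)]

    def __call__(self, x):
        correction = matvec(self.w_up, relu(matvec(self.w_down, x)))
        # The "residual" part: add the correction to the input, do not replace it.
        return [xi + ci for xi, ci in zip(x, correction)]

adapter = ResidualAdapter(d_model=4, bottleneck=2)
frame = [0.5, -1.0, 0.25, 2.0]
print(adapter(frame))  # same values as frame while w_up is all zeros

adapter.w_up[0][0] = 1.0  # pretend training changed one weight
print(adapter(frame))     # only the first value can move; the rest pass through
```

Because the output is always the input plus a correction, a speaker with no trained adapter (or an adapter whose correction is zero) gets exactly the base model's behavior.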

1. The system receives audio from a target speaker plus a speaker identifier (a unique ID for that person).
2. It uses that ID to load the correct sub-model for that speaker.
3. The encoder processes the audio through the activated adapters, producing modified, "biased" encoded audio.
4. The decoder then generates clean, fluent synthesized speech from that biased representation.
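The four steps above can be sketched as a toy pipeline. Everything here is a stand-in under stated assumptions: the speaker IDs, the adapter store, and the function names are hypothetical, the "audio" is a short list of numbers, and applying one adapter after each shared block is just one plausible reading of "between those blocks".

```python
# Toy walk-through of the four steps. Every name is hypothetical, and the
# encoder, adapters, and decoder are stand-ins operating on lists of floats.
from typing import Callable, Dict, List

Frame = List[float]
Adapter = Callable[[Frame], Frame]

def make_offset_adapter(offset: float) -> Adapter:
    """Stand-in adapter: adds a speaker-specific correction to each value."""
    return lambda frame: [v + offset for v in frame]

# Per-speaker sub-models, keyed by speaker ID (used in step 2).
ADAPTER_STORE: Dict[str, List[Adapter]] = {
    "speaker-001": [make_offset_adapter(0.1), make_offset_adapter(-0.2)],
    "speaker-002": [make_offset_adapter(0.3), make_offset_adapter(0.0)],
}

def shared_block(frame: Frame) -> Frame:
    """Stand-in for one shared self-attention block (identical for all users)."""
    return [2.0 * v for v in frame]

def encode(frame: Frame, adapters: List[Adapter]) -> Frame:
    """Step 3: run the shared blocks, applying one adapter after each block."""
    for adapter in adapters:
        frame = adapter(shared_block(frame))
    return frame

def decode(encoded: Frame) -> str:
    """Step 4 stand-in: a real decoder would emit audio, not a string."""
    return "synth:" + ",".join(f"{v:.2f}" for v in encoded)

def convert(audio: Frame, speaker_id: str) -> str:
    adapters = ADAPTER_STORE[speaker_id]  # step 2: look up the sub-model
    biased = encode(audio, adapters)      # step 3: biased encoding
    return decode(biased)                 # step 4: synthesized output

# Step 1: audio arrives together with a speaker identifier.
print(convert([1.0, 0.5], "speaker-001"))
print(convert([1.0, 0.5], "speaker-002"))
```

The same input produces different encodings for the two speakers even though `shared_block` never changes, which is the point of the design: only the small per-speaker pieces differ.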

The patent is especially focused on atypical speech: dysarthria (motor-impaired speech), stuttering, or other patterns that diverge from the training data most voice AI is built on. The sub-models teach the base model what a specific person's speech patterns look like, so it can normalize them accurately.

What this means for people with speech disabilities

For the many people worldwide with speech-affecting conditions, standard voice interfaces are often frustrating or unusable. A system like this could make voice-controlled devices, communication aids, and transcription tools genuinely accessible to people who have historically been underserved by one-size-fits-all AI.

From a technical strategy standpoint, the modular design is the real story. By making each user's personalization a small, stackable adapter rather than a full model, Google is describing a system that could realistically run at scale -- think Google Assistant or Pixel phone accessibility features -- without a prohibitive jump in server costs. If this ships in any form, it would be a meaningful step forward in how AI handles the full range of human voices.

Editorial take

This is one of those patents that's easy to overlook because it sounds like infrastructure, but the accessibility angle is genuinely important. Google is describing a real architectural solution to a real problem: voice AI that simply doesn't work for a large population of users. The scalable adapter design is the detail worth caring about -- it's the difference between a research demo and something that could actually run on Google's servers at consumer scale.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
