Samsung · Filed Sep 29, 2025 · Published Jun 18, 2026 · verified — real USPTO data

Samsung Patents Technology That Automatically Lowers Video Audio When Voice Narration Speaks

By Patentlyze Team · Updated Jun 19, 2026

When your phone reads text aloud while a video is playing, the two audio streams usually just fight each other. Samsung's new patent describes a system that automatically separates and rebalances those sounds so you can actually hear both.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number US 2026/0171105 A1

Applicant SAMSUNG ELECTRONICS CO., LTD.

Filing date Sep 29, 2025

Publication date Jun 18, 2026

Inventors Yoonjae LEE, Dongwoo KIM, Hyeonsik JEONG, Inwoo HWANG, Sunmin KIM, Hanki KIM

CPC classification 704/202

Grant likelihood Medium

Examiner CENTRAL, DOCKET (Art Unit OPAP)

Status Docketed New Case - Ready for Examination (Oct 17, 2025)

Parent application Continuation of PCT/KR2025/014393 (filed Sep 16, 2025)

Document 20 claims

Software

What Samsung's TTS audio-mixing system actually does

Imagine you're watching a YouTube video with your TV's screen reader turned on — the kind of feature that reads out menu text or on-screen captions for people with visual impairments. Right now, the narrator's voice and the video's audio just stack on top of each other, making both harder to understand.

Samsung's patent proposes a smarter mixing approach. When the text-to-speech feature kicks in, the device automatically splits the video's audio into separate layers, for example dialogue or narration on one side and background music and sound effects on the other. It then turns down the layer most similar in type to the screen reader's voice (probably speech) most aggressively, while leaving the other layers closer to their original volume.

The result is a blend where the screen reader's voice has room to be heard clearly, without the video going completely silent. Think of it like a DJ automatically ducking one instrument when the vocalist starts singing, except your device is doing it in real time for accessibility.

How the device splits and reweights video audio on the fly

The system activates when a device's text-to-speech (TTS) function turns on — the feature commonly used by screen readers to narrate on-screen content for blind or low-vision users. At that moment, the device doesn't just layer the TTS voice on top of whatever audio is already playing.

Instead, it separates the video's existing audio into distinct audio objects. Think of these as isolated tracks, similar to how a recording studio keeps vocals, drums, and guitar on separate channels. The patent describes at least two objects: one of the same sound type as the TTS voice (most likely speech), and one of a different type (most likely background music or ambient sound).

Each object gets a different weight, essentially a volume multiplier. The object that sounds like the TTS voice gets turned down more aggressively, since it would compete most directly with what the user needs to hear. The other object is reduced less. The final output mixes all three streams (the quieter speech-like layer, the less-reduced background layer, and the TTS voice) into a single balanced audio signal.
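
As a rough sketch of that weighting step, the Python below mixes three sample streams using one multiplier per audio object. The function name mix_with_tts, the default weights of 0.2 and 0.7, and the clamp to the -1.0 to 1.0 range are all assumptions made for illustration. No numeric values are taken from the patent, and the separation step is treated as already done.

```python
from typing import List, Sequence


def mix_with_tts(
    same_type: Sequence[float],   # audio object of the same type as the TTS voice (e.g. dialogue)
    other_type: Sequence[float],  # remaining audio object (e.g. music, sound effects)
    tts: Sequence[float],         # synthesized screen-reader voice
    w_same: float = 0.2,          # hypothetical weight: stronger reduction
    w_other: float = 0.7,         # hypothetical weight: milder reduction
) -> List[float]:
    """Mix two weighted audio objects with an unweighted TTS stream."""
    # The speech-like object must be reduced more than the other object.
    if not (0.0 <= w_same < w_other <= 1.0):
        raise ValueError("expected 0 <= w_same < w_other <= 1")
    if not (len(same_type) == len(other_type) == len(tts)):
        raise ValueError("all streams must have the same number of samples")

    mixed = [
        w_same * s + w_other * o + t
        for s, o, t in zip(same_type, other_type, tts)
    ]
    # Clamp to the normalized sample range so the sum cannot overflow.
    return [max(-1.0, min(1.0, x)) for x in mixed]


if __name__ == "__main__":
    dialogue = [0.5, -0.5, 0.25]
    music = [0.2, 0.2, -0.2]
    narrator = [0.1, 0.0, -0.1]
    print(mix_with_tts(dialogue, music, narrator))
```

With these defaults, a dialogue sample of 0.5 contributes 0.1 to the output while a music sample of 0.2 contributes 0.14, so the speech-like layer ends up quieter than the background even though it started louder.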

The patent doesn't specify exactly how the audio separation works, but audio source separation is an established field, and current approaches typically use machine-learning models trained to isolate voice from non-voice content, in some cases in real time.

What this means for accessibility on Samsung devices

Screen readers are a core accessibility tool for millions of people, but audio conflicts have always been a rough edge. On most devices today, you either mute the app you're in or suffer through two audio streams clashing. Samsung's approach would let you keep video context — background music, sound effects — while still clearly hearing the narrator. That's a meaningful quality-of-life improvement for blind and low-vision users who rely on TTS daily.

It also signals that Samsung is investing engineering effort in accessibility at the audio layer, not just at the UI level. If this ships in Galaxy phones or Samsung TVs, it could set a bar that other platforms feel pressure to meet — particularly as regulators in the EU and US push harder on accessibility standards for consumer electronics.

Editorial take

This is a genuinely useful accessibility patent that solves a real, annoying problem most sighted users have never thought about. It's not flashy technology, but the specificity of the approach — weighting audio objects by type rather than just applying a global duck — suggests Samsung has actually thought through what makes screen-reader audio hard to use. Worth watching to see if it surfaces in Galaxy devices or Samsung's TV platform.
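
To make the contrast with a global duck concrete, the short Python sketch below converts volume multipliers to decibels for both strategies. The multipliers are hypothetical and chosen only for illustration; they are not taken from the patent.

```python
import math
from typing import Dict


def gain_db(weight: float) -> float:
    """Convert a linear volume multiplier (> 0) to a gain in decibels."""
    if weight <= 0.0:
        raise ValueError("weight must be positive")
    return 20.0 * math.log10(weight)


# Hypothetical multipliers, not values from the patent.
GLOBAL_DUCK: Dict[str, float] = {"speech": 0.2, "background": 0.2}
PER_TYPE: Dict[str, float] = {"speech": 0.2, "background": 0.7}


def describe(weights: Dict[str, float]) -> Dict[str, float]:
    """Return each layer's gain in dB, rounded to one decimal place."""
    return {layer: round(gain_db(w), 1) for layer, w in weights.items()}


if __name__ == "__main__":
    print("global duck:", describe(GLOBAL_DUCK))
    print("per-type   :", describe(PER_TYPE))
```

Under these assumed numbers, a global duck drops both layers by about 14 dB, while per-type weighting drops the speech layer by about 14 dB and the background by only about 3 dB.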

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
