Qualcomm · Filed May 21, 2025 · Published Jun 4, 2026 · verified — real USPTO data

Qualcomm Patents an AI That Blends Any Image's Style With Any Other's Content

Qualcomm has filed a patent application for a system that can take the visual style of one image and apply it to the content of another, without retraining the underlying AI model for each new combination.

Qualcomm Patent: Zero-Shot AI Style Transfer for Images — figure from US 2026/0154856 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number: US 2026/0154856 A1
Applicant: QUALCOMM Incorporated
Filing date: May 21, 2025
Publication date: Jun 4, 2026
Inventors: Mohammad Reza KARIMI DASTJERDI, Kartikeya BHARDWAJ, Shubhankar Mangesh BORSE, Ankita NAYAK, Edward TEAGUE, Fatih Murat PORIKLI
CPC classification: 345/581
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Jun 20, 2025)
Parent application: Claims priority to provisional application 63/727,121 (filed Dec 2, 2024)
Document: 27 claims

What Qualcomm's style-content image blending actually does

Imagine you have a photo of your dog and a painting in the style of Van Gogh. You want a new image that looks like your dog painted by Van Gogh. Today, making that work convincingly usually requires a lot of compute, or fine-tuning a model specifically for that style — neither of which is practical on a phone.

Qualcomm's patent describes a system with two separate "adapters" — lightweight add-ons to an existing image-generation AI. One adapter reads a style image and encodes what makes it visually distinctive (brushstrokes, color palette, texture). The other reads a content image and encodes what's actually depicted (your dog, a city skyline, a coffee cup). The two encoded signals are then merged and fed into the main AI model to generate a new image.

The key phrase in the patent is zero-shot — meaning the system doesn't need to see a specific style during training to transfer it later. You hand it a new style image at runtime, and it figures it out on the fly. That's what makes this potentially useful at the edge, on devices like phones or XR headsets where retraining isn't an option.

How the dual adapters encode and merge style and content

The system sits on top of a pretrained generative model, such as a diffusion model (the kind that powers tools like Stable Diffusion). Rather than modifying the base model, it adds two specialized adapter modules:

  • Style adapter: Takes an input style image and produces a style embedding — a compact numerical representation capturing the visual aesthetic of that image (color, texture, brush feel, lighting mood).
  • Content adapter: Takes an input content image and produces a content embedding — encoding the semantic subject matter and structure (what objects are there and how they're arranged).
  • Combined embedding: The two embeddings are merged into a single representation that carries both signals simultaneously.

This combined embedding is then passed to the pretrained model, which generates an output image that reflects both the subject matter of the content image and the visual aesthetic of the style image.
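The flow described above (two adapters, a merge step, then a frozen generator) can be sketched in a few lines of Python. This is a toy illustration, not the patent's method: the function names are hypothetical, the "embeddings" are simple pixel statistics rather than learned features, and merging by concatenation is an assumption, since the summary here does not say how the two embeddings are combined.

```python
from typing import List

Vector = List[float]
Image = List[List[float]]  # toy grayscale image: rows of intensities in [0, 1]


def style_adapter(style_image: Image) -> Vector:
    """Toy style encoder (assumption): global brightness mean and variance."""
    pixels = [p for row in style_image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, variance]


def content_adapter(content_image: Image) -> Vector:
    """Toy content encoder (assumption): per-row mean, keeping coarse layout."""
    return [sum(row) / len(row) for row in content_image]


def combine(style_emb: Vector, content_emb: Vector) -> Vector:
    """Merge step. Concatenation is an assumption made for this sketch."""
    return style_emb + content_emb


def frozen_base_model(combined: Vector, style_dim: int, width: int) -> Image:
    """Stand-in for the pretrained generator: it has no trainable state here.

    It keeps the content layout and shifts brightness toward the style mean.
    """
    style_emb, content_emb = combined[:style_dim], combined[style_dim:]
    target_brightness = style_emb[0]
    layout_mean = sum(content_emb) / len(content_emb)
    rows = []
    for value in content_emb:
        shifted = value - layout_mean + target_brightness
        rows.append([min(1.0, max(0.0, shifted))] * width)  # clamp to [0, 1]
    return rows


def stylize(style_image: Image, content_image: Image) -> Image:
    """Run both adapters, merge their outputs, and call the base model."""
    style_emb = style_adapter(style_image)
    content_emb = content_adapter(content_image)
    combined = combine(style_emb, content_emb)
    return frozen_base_model(
        combined, style_dim=len(style_emb), width=len(content_image[0])
    )
```

Calling `stylize(style_image, content_image)` with two small nested lists returns an image with the content image's row layout and a brightness pulled toward the style image. The point of the sketch is the data flow: neither adapter touches the base model, and the base model only ever sees the combined embedding.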

The zero-shot aspect is the critical engineering claim here. Many earlier style-transfer approaches need either a separate model per style or an expensive fine-tuning step. Here, the adapters are trained once to generalize across arbitrary styles and content, so at inference time (when you're actually using the app, not training the model) the system can handle style-content pairs it has never seen before. This is the kind of architecture that could plausibly run efficiently on Qualcomm's Snapdragon silicon, where the base model is baked in and the adapters do the customization work at runtime.
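To make "zero-shot" concrete, here is a minimal sketch of what it means operationally: the adapter's weights are fixed after training, and a new style is handled by a plain forward pass with no update step. The `FrozenLinearAdapter` class, its linear form, and the feature vectors are all hypothetical stand-ins chosen for brevity. A real adapter would be a learned neural module.

```python
from typing import List, Sequence, Tuple


class FrozenLinearAdapter:
    """Hypothetical adapter: a fixed linear projection with read-only weights."""

    def __init__(self, weights: Sequence[Sequence[float]]) -> None:
        # Store as nested tuples so the weights cannot be mutated in place.
        self._weights: Tuple[Tuple[float, ...], ...] = tuple(
            tuple(row) for row in weights
        )

    @property
    def weights(self) -> Tuple[Tuple[float, ...], ...]:
        return self._weights

    def encode(self, features: Sequence[float]) -> List[float]:
        """Forward pass only: no gradients, no weight updates."""
        if any(len(row) != len(features) for row in self._weights):
            raise ValueError("feature length does not match adapter weights")
        return [
            sum(w * x for w, x in zip(row, features)) for row in self._weights
        ]


def encode_unseen_styles(
    adapter: FrozenLinearAdapter, styles: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Encode styles the adapter was never trained on, one forward pass each."""
    return [adapter.encode(style) for style in styles]
```

Every new style costs one `encode` call, and `adapter.weights` is identical before and after. A per-style fine-tuning approach would instead need a training loop for each new style, which is the step the zero-shot design avoids.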

What this means for on-device AI image editing

Style transfer has been around for years, but it has mostly been a cloud-side or desktop operation. The likely reason Qualcomm, a chip company, is filing patent applications in this space is that it wants this running on-device, in real time, on Snapdragon-powered phones, AR glasses, and PCs. If the adapter architecture is efficient enough to run at the edge, you could apply any style to any image directly in a camera or photo-editing app without sending data to a server.

The broader implication is that on-device generative AI is heading toward modular, adapter-based architectures — where a single frozen base model handles the heavy lifting and small swappable modules handle personalization. This patent is one piece of that larger architectural puzzle Qualcomm is clearly assembling.
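The modular pattern described here, one frozen base model shared by small swappable modules, can be sketched as a registry of adapters around an immutable base. Everything in this sketch is hypothetical (`BaseModel`, `AdapterPipeline`, and the toy arithmetic). It illustrates the general design pattern, not Qualcomm's software stack.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Embedding = List[float]
Adapter = Callable[[List[float]], Embedding]


@dataclass(frozen=True)
class BaseModel:
    """Stand-in for a pretrained generator whose weights never change."""

    weights: Tuple[float, ...]

    def generate(self, conditioning: Embedding) -> List[float]:
        # Toy "generation": scale each conditioning value by a fixed weight.
        return [w * c for w, c in zip(self.weights, conditioning)]


class AdapterPipeline:
    """One shared base model plus a registry of small, swappable adapters."""

    def __init__(self, base: BaseModel) -> None:
        self.base = base
        self.adapters: Dict[str, Adapter] = {}

    def register(self, name: str, adapter: Adapter) -> None:
        """Add or replace an adapter without touching the base model."""
        self.adapters[name] = adapter

    def run(self, adapter_name: str, features: List[float]) -> List[float]:
        """Pick an adapter by name, then feed its output to the frozen base."""
        conditioning = self.adapters[adapter_name](features)
        return self.base.generate(conditioning)
```

Because `BaseModel` is a frozen dataclass, any attempt to assign to `base.weights` raises an error, while adapters can be registered or replaced freely. That asymmetry is the design choice the paragraph above describes: the expensive part stays fixed and the cheap parts change.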

Editorial take

This is a technically solid patent that reflects where the serious on-device AI work is happening right now — not in making bigger models, but in making small adapter layers that let a frozen base model do new things without retraining. Qualcomm filing this is a direct signal that they're building the silicon and software stack to run generative image AI locally, not just as a demo but as a real product capability. Worth watching.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.