IBM · Filed Nov 16, 2024 · Published May 21, 2026 · verified — real USPTO data

IBM Seeks to Patent a Hybrid AI Router That Assigns Sub-Tasks to Specialist Models

Instead of throwing every question at one giant AI model, IBM's patent describes a system that figures out which specialist model is best suited for each piece of a complex task — and then sequences them like an assembly line.

[Figure: FIG. 1A, Hybrid Mixture-of-Experts Model-as-a-Service, from US 2026/0141274 A1. Rendered from the official USPTO publication PDF.]
Publication number: US 2026/0141274 A1
Applicant: International Business Machines Corporation
Filing date: Nov 16, 2024
Publication date: May 21, 2026
Inventors: Zhong Fang Yuan, Tong Liu, Wen Wang, He Li, Li Juan Gao
CPC classification: 706/10
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Dec 13, 2024)
Document: 20 claims

What IBM's model-routing system actually does

Imagine you're trying to plan a cross-country road trip. One person is great at picking hotels, another is great at finding gas stations, and a third is great at estimating drive times. Instead of asking one person to do all three, you split the work based on who's best at what.

That's essentially what IBM is seeking to patent here. When you send a complex request to this system, it breaks it into smaller sub-tasks and automatically routes each one to the AI model best suited to handle it. A coordinating "router" model, delivered as a cloud service, figures out the assignments and the order in which they should run.

The system learns which models are good at which tasks by watching them work through problems and studying their chains of thought — the step-by-step reasoning each model produces. That reasoning becomes the basis for future routing decisions.

How IBM maps tasks to models using chain-of-thought

The patent describes a Hybrid Mixture of Experts (MoE) architecture delivered as a cloud-based service. The core idea is that different ML models have different strengths, and a smart routing layer should exploit those differences rather than relying on a single general-purpose model.

Here's how the pipeline works:

  • A set of ML models is run on sample questions, and each one produces a chain of thought (CoT) — essentially a written-out reasoning trace showing how it arrived at its answer.
  • Those CoT outputs are analyzed to build a task-to-model mapping: a lookup table that associates types of tasks with the models most capable of handling them.
  • When a new complex request arrives, a SaaS-hosted routing model decomposes it into sub-tasks, consults the mapping, and selects the right specialist model for each piece.
  • The system then outputs machine-readable instructions defining both the model assignments and the execution sequence — which model runs first, which runs second, and so on.
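
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not IBM's implementation: the model names, task types, sample traces, and the scoring rule (counting reasoning steps in traces that led to a correct answer) are all assumptions made up for the example, and keyword matching stands in for the routing model's decomposition.

```python
import json
from collections import Counter, defaultdict

# Hypothetical traces: (model, task type, chain-of-thought text, answer correct?).
SAMPLE_TRACES = [
    ("geo-model", "route_planning", "Step 1: list cities. Step 2: order by distance.", True),
    ("geo-model", "lodging", "Step 1: guess a hotel.", False),
    ("travel-model", "lodging", "Step 1: filter by price. Step 2: rank by rating.", True),
    ("travel-model", "route_planning", "Step 1: pick any road.", False),
    ("math-model", "time_estimate", "Step 1: distance / speed. Step 2: add stops.", True),
]


def build_task_to_model_mapping(traces):
    """Build the lookup table: for each task type, keep the best-scoring model.

    Toy score (an assumption): number of reasoning steps in correct traces.
    """
    scores = defaultdict(Counter)
    for model, task_type, cot, correct in traces:
        scores[task_type][model] += cot.count("Step") if correct else 0
    return {task: counts.most_common(1)[0][0] for task, counts in scores.items()}


def decompose(request):
    """Toy decomposition: keyword matching stands in for the routing model."""
    keywords = {"route": "route_planning", "hotel": "lodging", "drive time": "time_estimate"}
    return [task for word, task in keywords.items() if word in request.lower()]


def plan(request, mapping):
    """Emit machine-readable instructions: model assignments plus run order."""
    steps = [
        {"order": i, "sub_task": task, "model": mapping[task]}
        for i, task in enumerate(decompose(request), start=1)
    ]
    return json.dumps({"request": request, "steps": steps}, indent=2)


mapping = build_task_to_model_mapping(SAMPLE_TRACES)
print(plan("Plan a route, book a hotel, and estimate the drive time.", mapping))
```

The printed JSON lists each sub-task with its assigned model and an order field, which is one plausible shape for the "machine-readable instructions" the application describes.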

The "hybrid" label comes from mixing a cloud-hosted orchestration layer with specialist models that may run on-premises or be distributed across other environments. The routing model itself is the coordinating intelligence; the specialist models are the workers.
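
To make the coordinator-and-workers split concrete, here is a minimal sketch of executing a routing plan across workers in different locations. Everything in it is hypothetical: the Worker class, the "cloud" and "on_prem" labels, the lambda stand-ins for real model endpoints, and the choice to pass each model's output to the next one in assembly-line fashion.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Worker:
    name: str
    location: str  # "cloud" or "on_prem"; labels are illustrative only
    run: Callable[[str], str]


# Hypothetical specialists; real ones would be model endpoints, not lambdas.
WORKERS: Dict[str, Worker] = {
    "travel-model": Worker("travel-model", "cloud", lambda text: text + " -> hotels ranked"),
    "math-model": Worker("math-model", "on_prem", lambda text: text + " -> drive time estimated"),
}


def execute(steps: List[dict], request: str) -> List[str]:
    """Run steps in their assigned order, feeding each output to the next model."""
    log = []
    payload = request
    for step in sorted(steps, key=lambda s: s["order"]):
        worker = WORKERS[step["model"]]
        payload = worker.run(payload)
        log.append(f'{step["order"]}. {worker.name} ({worker.location})')
    return log + [payload]


# Steps arrive unordered on purpose; the "order" field decides the sequence.
steps = [
    {"order": 2, "model": "math-model"},
    {"order": 1, "model": "travel-model"},
]
for line in execute(steps, "road trip"):
    print(line)
```

The coordinator only needs the plan and a registry of workers; where each worker physically runs is a property of the registry entry, not of the routing logic.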

What this means for enterprise AI-as-a-service costs

For enterprises running AI workloads, the cost and quality tradeoffs of large general-purpose models are real. A single large language model trained to do everything can underperform a smaller specialist on narrow tasks, and it typically costs more to run per token. IBM's approach suggests a future where AI infrastructure is decomposed: cheap, focused models handle the tasks they're best at, orchestrated by a lightweight router.
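
The cost argument comes down to simple arithmetic, sketched below. Every price and token count is a made-up placeholder chosen for illustration, not a vendor figure or anything stated in the application; the point is only the shape of the comparison.

```python
# Every number here is a made-up placeholder for illustration, not a vendor price.
PRICE_PER_1K_TOKENS = {
    "large-general": 0.5,
    "small-specialist": 0.125,
    "router": 0.0625,
}

SUB_TASK_TOKENS = [2000, 1000, 1000]  # hypothetical tokens per sub-task
ROUTER_TOKENS = 1000                  # hypothetical overhead for decomposition


def cost(model: str, tokens: int) -> float:
    """Cost of running `tokens` tokens through `model` at the placeholder rate."""
    return PRICE_PER_1K_TOKENS[model] * tokens / 1000


# One large model handles every sub-task.
monolithic = cost("large-general", sum(SUB_TASK_TOKENS))

# A router decomposes the request, then specialists handle the pieces.
routed = cost("router", ROUTER_TOKENS) + sum(
    cost("small-specialist", t) for t in SUB_TASK_TOKENS
)

print(f"monolithic: {monolithic:.4f}")
print(f"routed:     {routed:.4f}")
```

Under these placeholder rates the routed path is cheaper, but the router's own overhead is part of the bill: routing only pays off when the savings from specialists exceed what the decomposition step costs.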

This patent also fits neatly into IBM's broader watsonx platform strategy, where the pitch is enterprise-grade AI with governance and modularity. A system that can map tasks to models — and show its reasoning via chain-of-thought traces — is easier to audit and explain to compliance teams than a monolithic black-box model.

Editorial take

This is a solid, practical patent application that addresses a real enterprise AI problem: most complex tasks don't need one model that does everything; they need the right model for each piece. IBM is essentially seeking to patent a form of AI orchestration middleware, and given the direction the industry is heading with agentic frameworks and multi-model pipelines, this is a legitimate architectural bet worth watching.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.