Nvidia · Filed Dec 18, 2024 · Published Jun 18, 2026 · verified — real USPTO data

Nvidia Patents a Method to Stop AI Chips From Waiting on Shared Data

By Patentlyze Team · Updated Jun 19, 2026

When you're training a massive AI model across hundreds of chips, the slowest part is often just moving data around. Nvidia's newly published patent application tackles that by copying shared data to every chip that needs it in advance, so no chip has to stall while it waits for a neighbor to hand the data over.

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number: US 2026/0170317 A1
Applicant: NVIDIA Corporation
Filing date: Dec 18, 2024
Publication date: Jun 18, 2026
Inventors: Mohammad Amin Nabian
CPC classification: 706/21
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Jun 27, 2025)
Document: 20 claims

AI/ML

What Nvidia's shared-data partitioning actually does

Imagine a massive kitchen with dozens of cooks, each responsible for a different dish. If two cooks keep reaching across the counter to grab the same ingredient, that slows everything down. The smarter fix: give each cook their own copy of the ingredients they share most.

That's essentially what this Nvidia patent proposes for AI chips. When training or running a large AI model, the work is split across many specialized processors (called accelerators). Some of those processors need the same data. Instead of making them wait to share it, Nvidia's system figures out how much overlap exists between processors and then duplicates that overlapping data in advance.

The result is that each processor spends less time waiting and more time actually computing — which, at the scale of modern AI data centers, can translate to meaningfully faster training runs.

How Nvidia decides what data to duplicate and where

The patent describes a processor circuit that manages how a dataset is split — or partitioned — across multiple accelerators (the specialized chips that do the heavy lifting in AI workloads). The key insight is that partitions aren't kept strictly separate. Instead, some data is intentionally duplicated across partitions.
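To make the idea of partitions that are not strictly separate concrete, here is a minimal Python sketch. It is an illustration only, not the method claimed in the patent: it assumes the dataset is a simple ordered list of items, that each accelerator gets a contiguous slice, and that the duplicated data is a fixed-width border (`halo`) shared with neighboring slices. The function name and parameters are hypothetical.

```python
def partition_with_duplicates(num_items, num_accelerators, halo):
    """Split item indices 0..num_items-1 into contiguous partitions,
    then widen each partition by `halo` items on both sides so that
    neighboring partitions hold copies of the items along their border.

    Assumption for illustration: items are ordered and only adjacent
    partitions share data.
    """
    base = num_items // num_accelerators
    extra = num_items % num_accelerators
    partitions = []
    start = 0
    for rank in range(num_accelerators):
        # Spread any remainder over the first `extra` partitions.
        size = base + (1 if rank < extra else 0)
        end = start + size
        # Extend the slice, clamped to the bounds of the dataset.
        lo = max(0, start - halo)
        hi = min(num_items, end + halo)
        partitions.append(list(range(lo, hi)))
        start = end
    return partitions


# Ten items over two accelerators, duplicating one item on each side
# of the boundary: items 4 and 5 end up in both partitions.
parts = partition_with_duplicates(num_items=10, num_accelerators=2, halo=1)
duplicated = sorted(set(parts[0]) & set(parts[1]))
```

With `halo=0` the partitions are strictly disjoint; raising it trades extra memory on each accelerator for fewer requests to a neighbor.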

The duplication decision is driven by activations — the intermediate values a neural network produces as data flows through its layers (think of them as the running calculations the network hands off from one stage to the next). When two accelerators need the same activations to do their work, the system identifies that overlap and copies the relevant data to both, rather than forcing one chip to wait on the other.

This applies to both:

Training — teaching an AI model from scratch using large datasets
Inferencing — running an already-trained model to generate outputs (like answering a question or detecting an object)

The amount of duplication scales with the degree of sharing — chips that overlap heavily get more duplicated data; chips that rarely share data don't waste memory on unnecessary copies.
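The scaling behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the patent's actual procedure: each accelerator is assumed to have a known set of item ids it stores (`owned`) and a known set it reads while computing its activations (`needed`). The accelerator names, the example data, and both helper functions are hypothetical.

```python
from itertools import combinations


def sharing_degree(needed):
    """For each pair of accelerators, count the items that both read."""
    return {
        (a, b): len(needed[a] & needed[b])
        for a, b in combinations(sorted(needed), 2)
    }


def plan_duplication(owned, needed):
    """For each accelerator, list the items it reads but does not
    already store. These are the items to copy into its partition."""
    return {acc: needed[acc] - owned[acc] for acc in owned}


# Hypothetical example: gpu0 and gpu1 read overlapping items,
# while gpu2 only reads what it already stores.
owned = {
    "gpu0": {0, 1, 2, 3},
    "gpu1": {4, 5, 6, 7},
    "gpu2": {8, 9, 10, 11},
}
needed = {
    "gpu0": {0, 1, 2, 3, 4, 5},
    "gpu1": {3, 4, 5, 6, 7},
    "gpu2": {8, 9, 10, 11},
}

degree = sharing_degree(needed)
plan = plan_duplication(owned, needed)
```

In this toy example the pair `gpu0`/`gpu1` shares three items, so both receive copies, while `gpu2` shares nothing and receives none. That is the sense in which the amount of duplicated data follows the degree of sharing.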

What this means for large-scale AI training clusters

In large AI clusters — the kind Nvidia sells to cloud providers and AI labs — inter-chip communication is often the limiting factor, not raw computing power. When chips have to constantly request data from neighbors, they idle. This patent is Nvidia's attempt to pre-empt that bottleneck by making each accelerator more self-sufficient.

For you as an end user, faster and cheaper training means the AI models you rely on can improve more quickly and cost less to build. For Nvidia's customers, the companies running massive GPU clusters, this kind of optimization directly affects how much they spend per training run. If this approach ships in a future driver or system-level software stack, it could quietly make existing hardware more efficient without requiring new silicon.

Editorial take

This is unglamorous but genuinely useful infrastructure work. Data movement is a real, well-documented bottleneck in large-scale AI training, and targeted duplication is a sensible engineering answer. It won't make headlines like a new GPU architecture, but the kind of software-level optimization this patent describes is exactly what separates efficient AI clusters from wasteful ones.


Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
