Samsung · Filed Jan 8, 2025 · Published Jun 25, 2026 · verified — real USPTO data

Samsung Patents Technology That Stops Processors From Stalling While Teaching Computers New Skills

By Patentlyze Team · Updated Jun 26, 2026

FIG. 1A — rendered from the official USPTO publication PDF.

Publication number US 2026/0178912 A1

Applicant SAMSUNG ELECTRONICS CO., LTD.

Filing date Jan 8, 2025

Publication date Jun 25, 2026

Inventors Ziyan ZHAO, Beomsig CHO, Dan CAO, Kun DOU

CPC classification 706/16

Grant likelihood Medium

Examiner Not yet assigned (listed as CENTRAL, DOCKET; Art Unit OPAP)

Status Docketed New Case - Ready for Examination (Feb 19, 2025)

Document 21 claims

Hardware

What Samsung's GPU data pre-loading system actually does

Training a large AI model means running data through many layers, forward and then backward, over and over. Many modern models are too large to fit entirely in a GPU's fast memory, so the system constantly has to fetch data from slower storage while the GPU sits idle. That idle time is expensive.

Samsung's patent describes a coordination system where a central "host" chip orchestrates two things at once: it tells the GPU which data to load right now, and it simultaneously tells the storage drive to start pulling the next batch of data into a faster middle layer of memory. By the time the GPU finishes with the current data, the next chunk is already waiting nearby.

The key detail is that the system checks how much room is available in that middle memory layer before deciding what to pre-load, so it never accidentally overfills it. It's essentially a pipeline manager for AI training data.
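The capacity check can be pictured as a simple gate. The sketch below is an illustration only, not code from the patent: `DramBuffer`, `can_prefetch`, and `prefetch` are hypothetical names, and sizes are plain byte counts.

```python
from dataclasses import dataclass


@dataclass
class DramBuffer:
    """Hypothetical model of the middle memory layer."""
    capacity_bytes: int
    used_bytes: int = 0

    @property
    def free_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes


def can_prefetch(buffer: DramBuffer, chunk_bytes: int) -> bool:
    """Return True only if the next chunk fits in the remaining space."""
    return chunk_bytes <= buffer.free_bytes


def prefetch(buffer: DramBuffer, chunk_bytes: int) -> bool:
    """Reserve space for the chunk if it fits; otherwise skip the prefetch."""
    if not can_prefetch(buffer, chunk_bytes):
        return False
    buffer.used_bytes += chunk_bytes
    return True
```

With a 100-byte buffer, a first 60-byte prefetch is accepted and a second one is refused, because only 40 bytes remain.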

How the host coordinates DRAM prefetch and GPU loading

The patent targets the backward propagation phase of neural network training (the step where the model calculates errors and adjusts its internal weights). This phase is particularly memory-hungry because it needs to revisit data that was generated during the forward pass, data that may have already been pushed out of fast GPU memory to make room.

The system has three memory tiers in play:

NAND flash storage: the slowest but largest layer, where overflow data lives on the SSD
DRAM inside the storage device: a faster intermediate buffer sitting between the SSD and the GPU
GPU memory (such as HBM): the fastest layer, where the GPU actually does its work

The host apparatus (essentially the CPU or system controller) acts as a traffic coordinator. It sends two parallel instructions: one telling the GPU to load current-layer data from DRAM into its own memory, and a second telling the storage device to begin pre-fetching the next layer's data from NAND flash into DRAM. The prefetch decision is gated by checking available DRAM capacity, preventing buffer overflow.
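The coordination described above can be simulated in a few lines. This is a rough sketch under stated assumptions, not the patent's implementation: `run_backward_pass`, `gpu_load`, and `storage_fetch` are hypothetical names, a thread pool stands in for the two hardware commands, every layer is assumed to be the same size, and the on-demand fetch used when data is missing from DRAM is our own fallback for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import List, Optional, Tuple


def run_backward_pass(
    num_layers: int, layer_bytes: int, dram_capacity: int
) -> List[Tuple[int, Optional[int], bool]]:
    """Return one (current_layer, prefetched_layer, gpu_stalled) entry per step."""
    dram = {}  # layer index -> bytes held in the storage device's DRAM
    lock = Lock()
    log = []

    def storage_fetch(layer: int) -> None:
        # Stand-in for "move this layer's data from NAND flash into DRAM".
        with lock:
            dram[layer] = layer_bytes

    def gpu_load(layer: int) -> None:
        # Stand-in for "GPU copies this layer from DRAM", which frees the slot.
        with lock:
            dram.pop(layer)

    # Backward propagation visits layers from last to first.
    order = list(range(num_layers - 1, -1, -1))
    with ThreadPoolExecutor(max_workers=2) as pool:
        for step, current in enumerate(order):
            stalled = current not in dram
            if stalled:
                storage_fetch(current)  # assumed fallback: GPU waits for data
            nxt = order[step + 1] if step + 1 < len(order) else None
            # Gate the prefetch on free DRAM capacity before issuing it.
            free = dram_capacity - sum(dram.values())
            do_prefetch = nxt is not None and layer_bytes <= free
            # Issue both instructions side by side, then wait for both.
            futures = [pool.submit(gpu_load, current)]
            if do_prefetch:
                futures.append(pool.submit(storage_fetch, nxt))
            for future in futures:
                future.result()
            log.append((current, nxt if do_prefetch else None, stalled))
    return log
```

In this model, a DRAM buffer with room for two layers stalls only on the very first step, since nothing is staged yet. A buffer with room for just one layer never has space to prefetch, so the GPU stalls on every step.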

The result is that data movement across all three tiers happens in parallel rather than sequentially, which reduces or eliminates the GPU stall time that would otherwise occur between training layers.
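A back-of-the-envelope timing model shows why the overlap matters. The functions and numbers below are hypothetical, chosen only for illustration: `fetch` is the time to move one layer from NAND flash into DRAM, `gpu` is the time the GPU spends loading and processing one layer, and both are assumed constant across layers.

```python
def sequential_time(num_layers: int, fetch: int, gpu: int) -> int:
    """Every layer waits for its fetch, then does its GPU work."""
    return num_layers * (fetch + gpu)


def pipelined_time(num_layers: int, fetch: int, gpu: int) -> int:
    """The fetch for the next layer overlaps with GPU work on the current one."""
    # First fetch cannot be hidden; after that, each step costs the slower stage.
    return fetch + (num_layers - 1) * max(fetch, gpu) + gpu


def stall_per_step(fetch: int, gpu: int) -> int:
    """GPU idle time per steady-state step when fetches are overlapped."""
    return max(0, fetch - gpu)
```

With 10 layers, a fetch time of 3 units, and GPU time of 5 units, the sequential total is 80 units and the pipelined total is 53, with no steady-state stall. If the fetch takes 8 units instead, the pipelined total is 85 against 130 sequential, and the GPU still idles 3 units per step. That matches the "reduces or eliminates" wording: the stall disappears only when the fetch is at least as fast as the GPU work it hides behind.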

What this means for training very large AI models cheaply

GPUs are among the most expensive hardware in any AI training cluster, and they're only valuable when they're actually computing. Every millisecond a GPU spends waiting for data to arrive from slower storage is wasted money. For companies training very large models that can't fit entirely in GPU memory, this kind of intelligent pre-fetching can meaningfully reduce training time and cost.

For Samsung specifically, this patent is strategically interesting because Samsung makes both NAND flash storage and DRAM. A storage device that actively participates in AI training pipelines, rather than just passively handing over data, could become a real product differentiator as AI infrastructure spending grows. If the storage drive becomes a smarter partner in the process, GPUs spend less time idle, and training bills could shrink accordingly.

Editorial take

This is a solid infrastructure patent with a clear real-world payoff. It's not the kind of thing that makes headlines, but the problem it solves (GPU idle time during large-model training) is genuine and expensive. Samsung is well-positioned to ship this as a hardware-software feature in its enterprise SSD line, where the combination of NAND and DRAM under one roof makes the coordination scheme practical.


Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
