OpenAI · Filed Jan 23, 2026 · Published Jun 4, 2026 · verified — real USPTO data

OpenAI Patents a Tiled Chip Architecture That Does AI Math Inside Memory

OpenAI is designing its own chip architecture — and it's built around a simple but powerful idea: do the math where the data lives, not somewhere else.

OpenAI Patent: Tiled In-Memory Computing Architecture — figure from US 2026/0154218 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number US 2026/0154218 A1
Applicant OpenAI OpCo, LLC
Filing date Jan 23, 2026
Publication date Jun 4, 2026
Inventors Nawab ALI, Muzaffer KAL, Alexander Almela CONKLIN, Burak ERBAGCI, Cagri ERYILMAZ, Mohammed Elneanaei Abdelmoneem FOUDA
USPC classification 710/22
Grant likelihood Medium
Examiner CENTRAL, DOCKET (Art Unit OPAP)
Status Docketed New Case - Ready for Examination (Feb 25, 2026)
Parent application Continuation of 18789480 (filed Jul 30, 2024)
Document 20 claims

What OpenAI's in-memory compute tiles actually do

Here's a quick analogy: imagine your calculator had to drive across town to fetch every number before it could add them up. That's roughly what traditional chips do — they shuttle data from memory to a processor, crunch it, then send results back. It's fast, but all that travel adds up in time and power.

OpenAI's patent describes a different approach. Instead of moving data to the processor, the processor comes to the data. Small, self-contained units called compute tiles store the numbers (called weights) they need and do the multiplication right there in memory. A lightweight general-purpose processor sits alongside each tile to handle the trickier math — like the nonlinear functions neural networks need — and to keep things coordinated.
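To see why keeping weights in place matters, here is a rough counting sketch. It is not taken from the patent: the function names and the 1024 x 1024 layer size are made up for illustration, and it assumes the conventional chip streams every weight from memory for each multiplication (no caching or reuse).

```python
def words_moved_conventional(rows: int, cols: int) -> int:
    """Words crossing the memory/processor boundary for one vector-matrix
    multiply when weights live in a separate memory: every weight, the
    input vector, and the output vector all travel."""
    return rows * cols + cols + rows


def words_moved_in_memory(rows: int, cols: int) -> int:
    """Words crossing the tile boundary when the weights stay resident in
    the compute tile: only the input goes in and the output comes out."""
    return cols + rows


rows, cols = 1024, 1024  # hypothetical layer size, chosen for illustration
conventional = words_moved_conventional(rows, cols)
in_memory = words_moved_in_memory(rows, cols)
print(conventional, in_memory, conventional / in_memory)
```

Under those assumptions the traffic is dominated by the weights, which is exactly the part a compute tile never has to move.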

The tiles are also designed to talk to each other, with a built-in conversion engine that translates data between tiles when their formats don't match. Think of it as an automatic language translator sitting on the data highway between tiles.

How the CIM tiles, GP processor, and data conversion fit together

The patent describes a tiled compute architecture where each tile is a self-contained AI processing unit. Here's how the pieces fit together:

  • Compute-in-memory (CIM) engines — each engine stores a matrix of neural network weights and performs vector-matrix multiplication (VMM), the core operation in AI inference in which an input vector is multiplied by a weight matrix, directly inside the memory hardware, skipping the usual fetch-and-compute cycle.
  • General-purpose (GP) processor per tile — sits alongside the CIM engines, handles nonlinear operations like activation functions (e.g., ReLU, sigmoid), and orchestrates the engines. Crucially, data can move around the GP processor via bypass paths, so it doesn't become a bottleneck.
  • Addressable compute engines — external data movement initiators can push or pull data directly to/from engines without going through the GP processor, enabling parallel, low-latency data transfers.
  • Data conversion engine — when one tile outputs data in one numeric format and the next tile expects a different one, this converter handles the translation on the fly, letting tiles with different precision or data types work together in a single pipeline.
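The pieces in the list above can be modeled in a few lines of software. This is only a toy sketch: the class and method names (`CIMEngine`, `Tile`, `run_via_gp`, `run_bypass`) are invented here, the patent describes hardware rather than Python objects, and treating each weight row as one output is just one convention for a VMM.

```python
from typing import Callable, Dict, List

Vector = List[float]
Matrix = List[List[float]]


class CIMEngine:
    """Holds a weight matrix and multiplies input vectors against it in place."""

    def __init__(self, weights: Matrix) -> None:
        self.weights = weights  # stays resident; never shipped out per request

    def vmm(self, x: Vector) -> Vector:
        # One output per weight row: the dot product of that row with x.
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


def relu(v: Vector) -> Vector:
    """Example nonlinear function of the kind the GP processor would handle."""
    return [max(0.0, e) for e in v]


class Tile:
    """One tile: addressable CIM engines plus a stand-in for the GP processor."""

    def __init__(self, engines: Dict[int, CIMEngine],
                 activation: Callable[[Vector], Vector] = relu) -> None:
        self.engines = engines        # engine address -> engine
        self.activation = activation  # nonlinear step run by the GP processor

    def run_via_gp(self, address: int, x: Vector) -> Vector:
        # GP processor orchestrates: dispatch the VMM, then apply the activation.
        return self.activation(self.engines[address].vmm(x))

    def run_bypass(self, address: int, x: Vector) -> Vector:
        # An external initiator addresses the engine directly; the GP processor
        # is not on the data path, so no activation is applied.
        return self.engines[address].vmm(x)


tile = Tile({0: CIMEngine([[1.0, -2.0], [0.5, 0.5]])})
x = [1.0, 2.0]
raw = tile.run_bypass(0, x)        # raw VMM result
activated = tile.run_via_gp(0, x)  # VMM followed by ReLU
print(raw, activated)
```

The two methods mirror the two routes described above: one where the GP processor is in the loop, and one where data goes straight to an addressed engine.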

The multi-tile design means you can chain or array these tiles to scale up compute for larger models.
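As a rough illustration of chaining, the sketch below pipes one tile's floating-point output into a second tile that works on 8-bit integers, with a conversion step in between. The format pair (float to int8), the symmetric scaling scheme, and all function names are assumptions made for this example; the patent summary above does not say which formats or conversion method are used.

```python
from typing import List, Tuple


def float_tile(weights: List[List[float]], x: List[float]) -> List[float]:
    """Upstream tile: float weights, float output."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def convert_to_int8(values: List[float]) -> Tuple[List[int], float]:
    """Hypothetical conversion step: symmetric int8 codes plus the scale
    needed to map them back to real values (value is approx code * scale)."""
    peak = max(abs(v) for v in values)
    scale = peak / 127 if peak else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def int8_tile(weights: List[List[int]], codes: List[int],
              scale: float) -> List[float]:
    """Downstream tile: integer multiply-accumulate, then one rescale per output."""
    return [sum(w * c for w, c in zip(row, codes)) * scale for row in weights]


tile_a = [[0.5, 0.25], [1.0, -1.0]]  # illustrative float weights
tile_b = [[1, 1], [2, -1]]           # illustrative integer weights

a_out = float_tile(tile_a, [2.0, 4.0])
codes, scale = convert_to_int8(a_out)
b_out = int8_tile(tile_b, codes, scale)
print(a_out, codes, b_out)
```

The downstream tile never sees a float until the final rescale, which is the kind of handoff a converter between tiles would make possible.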

What this means for the future of AI accelerator chips

For large AI models, the biggest bottleneck is often not the math but moving data. Shuttling weights between memory and processor costs both time and power, and compute-in-memory architectures are one of the most promising ways to cut that cost. If OpenAI is patenting this kind of hardware, it suggests the company is thinking seriously about building custom silicon optimized for its own inference workloads, not just relying on Nvidia GPUs.

For you as a user, the downstream effect would be AI responses that are faster and cheaper to generate — which matters a lot when OpenAI is serving hundreds of millions of queries. The inter-tile data conversion engine is a particularly practical touch: it means you can mix precision levels across a chip, trading off accuracy for speed in layers where it won't hurt quality.
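The accuracy side of that trade-off is easy to see in a small experiment. The sketch below rounds a handful of made-up weights to 8-bit and 4-bit grids and reports the worst rounding error at each width. The quantization scheme (symmetric, scaled to the largest value) and the weight values are assumptions for illustration, not details from the patent.

```python
from typing import List


def quantize(values: List[float], bits: int) -> List[float]:
    """Round to a symmetric signed grid with `bits` bits, then map back to floats."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    scale = peak / levels  # spacing between neighboring grid points
    return [round(v / scale) * scale for v in values]


def worst_error(values: List[float], bits: int) -> float:
    """Largest absolute difference between a value and its quantized version."""
    approx = quantize(values, bits)
    return max(abs(a - v) for a, v in zip(approx, values))


weights = [0.9, -0.35, 0.12, 0.6, -0.81]  # hypothetical layer weights
for bits in (8, 4):
    print(bits, worst_error(weights, bits))
```

Fewer bits means a coarser grid and a larger worst-case error, so low precision only makes sense in layers that can tolerate it.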

Editorial take

This is a genuinely substantive hardware patent, not a paper exercise. The specific combination of CIM engines, per-tile GP processors, bypass data paths, and a format-conversion bridge between tiles reads like a real chip microarchitecture spec, not a vague concept filing. OpenAI patenting this kind of low-level silicon work is the clearest signal yet that they're serious about owning their own compute stack end-to-end.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.