Tesla · Filed Jan 21, 2026 · Published May 28, 2026 · verified — real USPTO data

Tesla Patents a Parallel 2D Matrix Processor for Accelerated AI Image Math

Tesla is patenting the math engine at the heart of neural network image processing — a two-dimensional matrix processor designed to run massive numbers of multiply-accumulate operations at the same time, in lockstep.

Tesla Patent: Accelerated Mathematical Engine for AI Chips — figure from US 2026/0147540 A1
FIG. 1A — rendered from the official USPTO publication PDF.
Publication number US 2026/0147540 A1
Applicant Tesla, Inc.
Filing date Jan 21, 2026
Publication date May 28, 2026
Inventors Peter Joseph Bannon, Kevin Altair Hurd, Emil Talpes
CPC classification 382/156
Grant likelihood Medium
Examiner CENTRAL, DOCKET (Art Unit OPAP)
Status Docketed New Case - Ready for Examination (Feb 22, 2026)
Parent application is a Continuation of 18326864 (filed 2023-05-31)

What Tesla's matrix engine actually does to images

Imagine every frame your car's camera captures needs to be analyzed almost instantly — spotting a pedestrian, reading a lane line, flagging a stopped vehicle. That analysis is basically a huge pile of multiplication problems, and doing them one at a time is far too slow.

Tesla's patent describes a chip architecture built to tackle that pile in parallel. Instead of processing image data sequentially, the design lays out a two-dimensional grid of small computing units, each handling its own slice of the math at the same time.

Each unit in the grid carries its own arithmetic logic unit (the part that does the actual math), plus two registers — one to hold the current result and a "shadow" register to stage the next batch of data. A shared clock keeps everything synchronized, so image pixels and their associated weights get multiplied in unison across the whole grid.

How the 2D subcircuit grid crunches convolutions

The core idea is convolution acceleration — the specific type of math that convolutional neural networks (CNNs) use to detect features in images. A convolution slides a small grid of numbers (called a kernel or filter) across an image and multiplies each pixel value by the corresponding weight, then sums the results. Do that millions of times per frame, across many filters, and you have a serious compute problem.

Tesla's solution is a two-dimensional matrix processor made up of subcircuits arranged in a grid. Each subcircuit contains:

  • An ALU (Arithmetic Logic Unit) — the tiny calculator that does one multiply-add operation
  • An output register — holds the result of the current computation
  • A shadow register — pre-loads the next set of data so the ALU never has to wait

The shadow register trick is particularly notable. It's a classic double-buffering approach: while the ALU is crunching one batch of numbers, the shadow register is quietly receiving the next batch. When the clock ticks, roles swap instantly — zero idle time.

The entire grid runs off a synchronized clock, meaning image data and weight values are fed in and multiplied across all subcircuits simultaneously. This is what enables the "large number of mathematical operations in parallel" the abstract promises.

What this means for Tesla's in-house AI chip ambitions

Tesla designs its own AI inference chips — the Full Self-Driving (FSD) chip and its successor the D1 (used in Dojo, their training supercomputer). Patents like this one offer a window into the low-level architectural decisions behind those chips. A clocked, 2D matrix processor with shadow registers is exactly the kind of silicon-level detail that determines how fast a neural network can process camera frames on a moving vehicle.

Beyond cars, Tesla has said it intends to sell AI compute capacity externally. If their matrix engine achieves better throughput-per-watt than off-the-shelf GPU solutions, that matters for both the cost of running FSD and any future data center product. The inventors — Peter Bannon, Kevin Hurd, and Emil Talpes — are veterans of Tesla's chip team with deep roots in Apple's silicon group, lending credibility to the seriousness of this design.

Editorial take

This is unglamorous but genuinely important foundational work. Matrix processors are the beating heart of every AI inference chip, and the specific choices Tesla makes here — 2D layout, shadow registers, synchronous clocking — directly determine the performance ceiling of their neural nets at the edge. The pedigree of the inventors (ex-Apple silicon architects) suggests this isn't exploratory research; it's production-track chip engineering.

Get one Big Tech patent every Sunday

Plain English, intelligent commentary, no hype. Free.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.