New Google Patents · Filed Apr 29, 2024 · Published Jun 4, 2026 · verified — real USPTO data

Google's New Patent Teaches AI to Find Drug Candidates Faster

By Patentlyze Team · Updated Jul 10, 2026

Finding a drug that sticks to the right biological target is a needle-in-a-haystack problem — Google is filing patents on ways to teach AI to read the haystack faster.

Figure from the official USPTO publication.

Publication number: US 2026/0154604 A1
Applicant: GOOGLE LLC
Filing date: Apr 29, 2024
Publication date: Jun 4, 2026
Inventors: Wen TORNG, Steven KEARNES, Stephan HOYER, Kevin MCCLOSKEY, Jin XU, Jianwen FENG, Sharad VIKRAM, Matt HOFFMAN, Brian PATTON
CPC classification: 706/12
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Mar 2, 2026)
Parent application: National Stage Entry of PCTUS2022053586 (filed 2022-12-21)
Document: 20 claims

AI simulation

What Google's DNA-library drug-screening AI actually does

Imagine you're trying to find a key that fits a very specific lock, but you have millions of possible keys to test. That's essentially what early-stage drug discovery looks like. Scientists use a technique called a DNA-encoded library (DEL) — they attach tiny DNA barcodes to millions of different chemical compounds, mix them all with a biological target (like a protein linked to a disease), and then count which barcodes show up most often after the unbound molecules are washed away. The more times a barcode is counted, the better that compound likely binds.
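
To make the counting step concrete, here is a minimal sketch that tallies barcode reads and ranks them. The barcode strings and the `reads` list are invented for illustration; a real DEL pipeline starts from sequencing output and involves far more cleanup than this.

```python
from collections import Counter

# Hypothetical sequencing output: one barcode string per DNA read that
# survived the wash step. These values are made up for illustration.
reads = ["ACGT", "TTAG", "ACGT", "GGCA", "ACGT", "TTAG"]

def rank_barcodes(reads):
    """Return (barcode, count) pairs, most frequently observed first."""
    return Counter(reads).most_common()

ranking = rank_barcodes(reads)
print(ranking)
```

The barcode at the top of the ranking is the naive "best binder" guess, which is exactly the reading that the noise in these counts makes unreliable.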

The problem is that raw barcode count data is noisy and indirect — it doesn't tell you how well a molecule binds, just how often it showed up. Google's patent describes a way to train an AI model that learns to predict binding strength directly, then works backwards to simulate what the barcode count data should look like, and compares that to what was actually observed. The gap between expected and real counts teaches the AI how to get better.

This approach lets the AI learn from messy, real-world experimental data without needing perfectly clean measurements — which is pretty much the situation every drug discovery lab is actually in.

How the Poisson model bridges AI predictions and DNA read counts

The core of this patent is a training pipeline for a graph neural network (GNN) — an AI architecture that treats molecules as graphs, where atoms are nodes and chemical bonds are edges. The GNN predicts a molecule's binding affinity (how strongly it sticks to a target protein) as a single number.
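
As a toy picture of "molecule as graph in, single number out", the sketch below runs two rounds of neighbor summing over a three-atom graph and pools the result into one score. Everything here is an assumption for illustration: the `ELEMENT_FEATURE` table, the fixed `weight`, and the summing rule are hypothetical stand-ins, whereas a real GNN uses learned weights and vector features, and the patent summary does not describe its architecture at this level.

```python
# Toy molecule graph for the heavy atoms of ethanol: C-C-O.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]  # undirected edges between atom indices

# Hypothetical one-number feature per element (not learned, not from the patent).
ELEMENT_FEATURE = {"C": 1.0, "O": 2.0}

def message_pass(features, bonds):
    """One round: each atom adds the features of its bonded neighbours."""
    updated = list(features)
    for i, j in bonds:
        updated[i] += features[j]
        updated[j] += features[i]
    return updated

def predict_affinity(atoms, bonds, weight=0.1, rounds=2):
    """Encode the graph, then pool atom features into a single score."""
    features = [ELEMENT_FEATURE[a] for a in atoms]
    for _ in range(rounds):
        features = message_pass(features, bonds)
    return weight * sum(features)

score = predict_affinity(atoms, bonds)
print(score)
```

The point of the design is that the output depends on how atoms are connected, not just on which atoms are present.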

But here's the clever part: instead of directly comparing that predicted affinity to a noisy experimental measurement, the system routes it through a Poisson probabilistic model of the DEL experiment itself. A Poisson model (a standard statistical distribution for counts of independent events, such as sequencing reads) estimates how many DNA reads you'd expect to observe if the molecule truly had that predicted affinity. That expected count is then compared to the actual observed read count from the lab experiment to generate a loss value, a score telling the model how wrong it was.
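
A minimal sketch of that comparison, using the standard Poisson negative log-likelihood as the loss. The exponential mapping from affinity to expected reads in `expected_reads`, and its `scale` parameter, are assumptions made here to keep the rate positive; the patent summary does not say how the mapping is defined.

```python
import math

def expected_reads(affinity, scale=1.0):
    """Map a predicted affinity to a Poisson rate (expected read count).

    The exponential link is an assumption for this sketch, not a detail
    taken from the patent.
    """
    return scale * math.exp(affinity)

def poisson_nll(observed, rate):
    """Negative log-likelihood of `observed` reads under Poisson(rate)."""
    return rate - observed * math.log(rate) + math.lgamma(observed + 1)

observed = 12
good = poisson_nll(observed, expected_reads(2.5))  # rate close to 12
poor = poisson_nll(observed, expected_reads(0.5))  # rate well below 12
print(good < poor)
```

The loss is smallest when the expected count matches the observed count, so a predicted affinity that explains the data gets a lower score than one that does not.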

The patent also describes augmenting training with simulated disynthon data. DEL compounds are typically built from two or three chemical building blocks; a disynthon is a partial compound made of two of the three blocks. The system can generate simulated training examples by combining predicted affinities for these partial structures, effectively multiplying the useful training signal from a single experiment.
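
One plausible reading of disynthon aggregation, and it is only an assumption here, is to pool the expected counts of every full compound that shares the same two building blocks. The sketch below does that with invented block names and counts; the `disynthon_totals` helper is hypothetical, and the patent may combine partial structures differently.

```python
from collections import defaultdict

# Hypothetical expected read counts per three-block compound (A, B, C).
expected = {
    ("A1", "B1", "C1"): 4.0,
    ("A1", "B1", "C2"): 6.0,
    ("A1", "B2", "C1"): 1.0,
    ("A2", "B1", "C1"): 0.5,
}

def disynthon_totals(expected, keep=(0, 1)):
    """Sum expected counts over the block position that is dropped."""
    totals = defaultdict(float)
    for blocks, count in expected.items():
        key = tuple(blocks[i] for i in keep)
        totals[key] += count
    return dict(totals)

ab = disynthon_totals(expected)
print(ab)
```

Summing is a natural fit for a Poisson model because the sum of independent Poisson counts is itself Poisson, with a rate equal to the sum of the individual rates.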

GNN encodes a molecule's graph structure into a predicted affinity score
Affinity is fed into a Poisson model of the DEL process to predict expected read counts
Expected vs. actual read counts generate the training loss
Simulated disynthon examples expand the training data without new experiments
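
The first three steps above can be sketched as a tiny training loop. This is a sketch under loud assumptions, not the patent's implementation: a two-parameter linear model stands in for the GNN, the predicted affinity is used directly as the log of the expected read count, and the `features` and `counts` values are invented. The disynthon step is left out.

```python
import math

# Hypothetical training set: one scalar feature per molecule and its
# observed read count. A real system would feed molecular graphs to a GNN.
features = [0.0, 1.0, 2.0, 3.0]
counts = [1, 3, 7, 20]

def mean_loss(w, b):
    """Average Poisson negative log-likelihood over the training set."""
    total = 0.0
    for x, k in zip(features, counts):
        log_rate = w * x + b  # predicted affinity, used as the log rate
        total += math.exp(log_rate) - k * log_rate + math.lgamma(k + 1)
    return total / len(features)

def train(steps=500, lr=0.01):
    """Plain gradient descent on the mean Poisson loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, k in zip(features, counts):
            residual = math.exp(w * x + b) - k  # expected minus observed
            grad_w += residual * x
            grad_b += residual
        w -= lr * grad_w / len(features)
        b -= lr * grad_b / len(features)
    return w, b

initial_loss = mean_loss(0.0, 0.0)
w, b = train()
final_loss = mean_loss(w, b)
print(initial_loss, final_loss)
```

Note that the update is driven by the gap between expected and observed counts, which is the "expected vs. actual" signal the list describes.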

What this means for computational drug discovery pipelines

DEL experiments can screen tens of millions of compounds at once, but the read-count data they produce is famously noisy: a molecule might appear hundreds of times just due to statistical flukes, or be underrepresented despite binding well. Models trained directly on raw read counts can end up chasing that noise. By inserting a probabilistic layer that models the experimental process itself, Google's approach teaches the AI to reason about what the data means, not just what it says. The training target moves from the raw counts to the binding strength that would explain them.
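
A quick worked example shows how much sampling noise alone can distort low counts. It assumes pure Poisson sampling noise, which is a simplification: real DEL data has other noise sources that this sketch ignores, and the rates used are invented.

```python
import math

def poisson_pmf(k, rate):
    """Probability of observing exactly k reads when `rate` are expected."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))

def relative_noise(rate):
    """Standard deviation divided by mean for a Poisson count."""
    return 1.0 / math.sqrt(rate)

p_missed = poisson_pmf(0, 2.0)       # a binder expected at 2 reads shows 0
noise_low = relative_noise(2.0)      # noise relative to signal at 2 reads
noise_high = relative_noise(200.0)   # same ratio at 200 reads
print(p_missed, noise_low, noise_high)
```

Under these assumptions, a compound expected to produce two reads comes back with none roughly one time in seven, and the relative noise shrinks only with the square root of the expected count.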

For anyone working in computational drug discovery, this matters because it promises better-calibrated binding predictions from the same experimental budget — potentially surfacing stronger lead compounds earlier. Google's DeepMind and Google Research groups have been building credibility in molecular ML for years, and this patent is consistent with a push toward tools pharma companies would actually license or partner on.

Editorial take

This is a genuinely interesting methodological patent, not a flashy product announcement. The idea of embedding a probabilistic model of the experimental process inside the AI training loop — so the model learns from noisy data without being fooled by it — is the kind of careful, principled engineering that separates useful ML from academic benchmarking. Whether Google turns this into a commercial drug discovery platform or keeps it as internal research infrastructure is the open question.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.
