Google · Filed Nov 19, 2025 · Published May 21, 2026 · verified — real USPTO data

Google Patents an AI Agent That Remembers Past Conversations Using Embeddings

Most AI assistants forget everything the moment you close the chat window. Google has applied to patent a way for its agents to selectively pull relevant chunks from your entire conversation history, not just the current session.

[Figure: FIG. 1A from US 2026/0140980 A1, rendered from the official USPTO publication PDF.]
Publication number: US 2026/0140980 A1
Applicant: Google LLC
Filing date: Nov 19, 2025
Publication date: May 21, 2026
Inventors: Carsten Isert, Patrick Andreas Zoechbauer, Nicolas Charles Yves Gros, Rhomé François Jean-Baptiste Falaize, Hugo Jose Dias Lopes, Cristian Pavel, Shiqi Chen, Manzil Zaheer, Kavya Venkata Kota Sai Kopparapu, Kenneth Daniel Marino, Robert David Fergus, Zun Li, Nishant Yadav, Ian Song Fischer, Kuang-Huei Lee, Veeranjaneyulu Sadhanala, Lauren Haley Beltrone, Ricardo Jose Galbis, Lei Zhong
CPC classification: 704/9
Grant likelihood: Medium
Examiner: CENTRAL, DOCKET (Art Unit OPAP)
Status: Docketed New Case - Ready for Examination (Dec 27, 2025)
Parent application: claims priority from provisional application 63722561 (filed Nov 19, 2024)
Document: 20 claims

What Google's conversation-memory agent actually does

Imagine you spent three weeks planning a trip with an AI assistant — flights, hotels, dietary preferences, the works. Then a month later you ask it a follow-up question, and it acts like it has never met you. That's the current reality with most AI chatbots, and it's genuinely frustrating.

Google's patent describes a system that keeps a running archive of your past conversations, broken into segments. When you ask a new question, the agent checks whether anything from your history is relevant — and if so, quietly pulls those pieces in before crafting its response. You don't have to re-explain yourself.

The clever part is that the system doesn't just dump your entire chat history into every reply (which would be slow and expensive). It uses vector embeddings — mathematical representations of meaning — to find only the most relevant past moments and surface those. It's targeted retrieval, not a memory dump.
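
To make "finding the most relevant past moments" concrete, here is a minimal Python sketch. It is not taken from the patent: `embed` is a toy word-count stand-in for a learned embedding model, `cosine` is ordinary cosine similarity (one common way to compare embeddings), and the sample history and query are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a learned embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical past segments and a new question.
history = [
    "I want a hotel near the beach in Lisbon under 150 euros a night",
    "Please refactor this Python function to use list comprehensions",
]
query = "Which hotel in Lisbon did we pick?"

# Score every stored segment against the query and keep the closest one.
scores = [cosine(embed(query), embed(segment)) for segment in history]
best = history[scores.index(max(scores))]
print(best)
```

In this toy run the hotel segment shares words with the query and the coding segment shares none, so only the hotel segment would be surfaced. A real embedding model would also match paraphrases that share no words at all, which a word-count vector cannot do.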

How the embedding lookup pulls the right past segments

The patent describes a retrieval-augmented approach to conversation memory for software agents. Here's the core flow:

  • All past conversations are chunked into segments, and each segment is converted into a segment embedding — a high-dimensional vector that encodes its semantic meaning.
  • When a new user query arrives, the system first decides whether the query is likely relevant to any past conversation at all (skipping the retrieval step entirely if not).
  • If relevance is detected, a query embedding is generated from the current question, then compared against stored segment embeddings using similarity search — think nearest-neighbor lookup in vector space.
  • The top-matching past segments are retrieved and their data is passed alongside the current query to generate the final response.
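
The four steps above can be sketched end to end in Python. Everything here is an illustrative assumption rather than the patent's actual method: the fixed-size `chunk` function, the cue-word `needs_history` gate, the word-count `embed` stand-in for a real embedding model, and the `k` and `min_score` parameters are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def embed(text: str) -> Counter:
    """Toy stand-in for a learned embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Segment:
    conversation_id: str
    text: str
    embedding: Counter  # computed once, when the segment is stored


def chunk(conversation_id: str, turns: list[str], size: int = 2) -> list[Segment]:
    """Step 1: group consecutive turns into segments and embed each one."""
    segments = []
    for i in range(0, len(turns), size):
        text = " ".join(turns[i:i + size])
        segments.append(Segment(conversation_id, text, embed(text)))
    return segments


# Hypothetical gate: words that hint the user is referring to earlier chats.
HISTORY_CUES = {"we", "again", "earlier", "previously", "remember", "last"}


def needs_history(query: str) -> bool:
    """Step 2: cheap check that runs before any embedding work is done."""
    return bool(HISTORY_CUES & set(embed(query)))


def retrieve(query: str, index: list[Segment], k: int = 2,
             min_score: float = 0.2) -> list[Segment]:
    if not needs_history(query):
        return []  # skip retrieval entirely
    q = embed(query)  # Step 3: query embedding, then similarity search
    scored = sorted(((cosine(q, s.embedding), s) for s in index),
                    key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score >= min_score]


def build_prompt(query: str, segments: list[Segment]) -> str:
    """Step 4: pass retrieved segments alongside the current query."""
    if not segments:
        return f"User: {query}"
    context = "\n".join(f"[{s.conversation_id}] {s.text}" for s in segments)
    return f"Relevant history:\n{context}\n\nUser: {query}"


index = chunk("trip", [
    "I am vegetarian and need a hotel in Lisbon",
    "Booked Hotel Alfama for May 3",
    "Flights leave from Munich at 9am",
    "Seat 14A confirmed",
]) + chunk("code", [
    "Use tabs not spaces for my Go code",
    "Noted, tabs it is",
])

hits = retrieve("Which hotel did we book in Lisbon?", index)
print(build_prompt("Which hotel did we book in Lisbon?", hits))
print(build_prompt("What is the capital of France?",
                   retrieve("What is the capital of France?", index)))
```

The first query trips the gate and pulls in only the hotel segment. The second never reaches the similarity search, so its prompt contains the question alone. A production system would presumably replace the cue-word gate with a learned classifier and the sorted scan with an approximate nearest-neighbor index.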

This is essentially RAG (Retrieval-Augmented Generation) — a technique where a language model is given retrieved documents at inference time rather than relying solely on its trained weights — but applied to a user's personal conversation archive instead of a document corpus.

The patent emphasizes both selectivity (only fetching relevant history) and adaptivity (the system can tune how much history to surface). The segment embeddings are precomputed and stored, so retrieval is fast even over very long histories.
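
One plausible reading of "tuning how much history to surface" is a similarity floor combined with a size budget. The Python sketch below assumes similarity scores have already been computed against the stored embeddings. The names `select_adaptive`, `min_score`, and `token_budget`, and the whitespace word count used as a token estimate, are hypothetical choices and not details from the filing.

```python
def select_adaptive(scored: list[tuple[float, str]],
                    min_score: float = 0.3,
                    token_budget: int = 40) -> list[str]:
    """Pick segments best-first until scores drop too low or the budget is spent.

    scored: (similarity, segment_text) pairs for one query.
    """
    chosen: list[str] = []
    used = 0
    for score, text in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if score < min_score:
            break  # everything after this is even less relevant
        cost = len(text.split())  # crude token estimate (assumption)
        if used + cost > token_budget:
            continue  # too big to fit; a smaller segment may still fit
        chosen.append(text)
        used += cost
    return chosen


# Hypothetical similarity scores for one query.
scored = [
    (0.82, "User prefers vegetarian restaurants"),
    (0.35, "Budget is capped at 2000 euros for the whole trip"),
    (0.10, "User asked about Go formatting"),
]

print(select_adaptive(scored))                  # two segments fit
print(select_adaptive(scored, token_budget=5))  # only the shortest, best match
print(select_adaptive(scored, min_score=0.9))   # nothing is relevant enough
```

The same stored history yields two, one, or zero segments depending on the settings, which is the kind of behavior the selectivity and adaptivity language points at.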

What this means for long-running AI assistant sessions

For anyone who uses an AI assistant repeatedly over days or weeks — for work projects, ongoing research, or personal planning — this kind of persistent, searchable memory is a meaningful quality-of-life improvement. Right now, most AI assistants treat every session as a blank slate, forcing you to re-establish context constantly. A system like this could make long-term AI assistants feel genuinely continuous rather than amnesiac.

From a strategic angle, this filing lines up with Google's push to make Gemini a persistent, personalized assistant across its product surface — Gmail, Docs, Search, and beyond. If your AI assistant can recall that you mentioned a budget constraint six conversations ago, or that you prefer a particular coding style, the value of staying inside Google's ecosystem compounds over time. That's a meaningful retention play.

Editorial take

This is a solid, well-scoped engineering filing for a real problem. RAG applied to personal conversation history is an obvious next step for any serious AI assistant, and the selective-retrieval design (only fetching history when relevant) shows Google is thinking about latency and cost, not just capability. The large inventor list (19 people) suggests this is genuine production engineering, not a paper patent.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.