Samsung · Filed May 14, 2025 · Published May 28, 2026 · verified — real USPTO data

Samsung Patents a Layered AI System for Precise Robot Motion Control

Teaching a robot to 'clean the table' is easy to say and brutally hard to execute. Samsung's new patent application tackles that gap by chaining AI models together to translate a high-level goal into precise, frame-by-frame physical movements.

FIG. 1A of US 2026/0145325 A1 (AI-driven robot micro-action system), rendered from the official USPTO publication PDF.
Publication number: US 2026/0145325 A1
Applicant: Samsung Electronics Co., Ltd.
Filing date: May 14, 2025
Publication date: May 28, 2026
Inventors: Inseop CHUNG, Sung Hyun CHUNG, Junho CHO, Kapje SUNG, Jinhyuk CHOI
CPC classification: 700/245
Grant likelihood: Medium
Examiner: NGUYEN, ROBERT T (Art Unit 3619)
Status: Docketed New Case - Ready for Examination (Jun 18, 2025)
Document: 20 claims

How Samsung's robot brain breaks big tasks into tiny moves

Imagine telling a robot, 'go put the cup in the sink.' That instruction makes total sense to you, but a robot needs to figure out dozens of tiny physical steps — reach, grasp, rotate, move, release — and it needs to re-evaluate constantly as the scene changes.

Samsung's patent describes a system where two AI models work in sequence. The first model looks at the robot's current camera frame alongside the big-picture task and generates a step prompt — essentially a mid-level instruction like 'now grip the handle.' The second model then combines the original task, that step instruction, and the visual frame to decide on a precise micro-action: the exact movement the robot should make right now.

The result is a hierarchy: a broad goal at the top, a situational sub-task in the middle, and a granular physical motion at the bottom. This layered approach means the robot stays responsive to what it actually sees, rather than blindly following a pre-written script.
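To make the hierarchy concrete, here is a minimal Python sketch of the per-frame loop. It is an illustration under stated assumptions, not Samsung's implementation: the filing publishes no code, so every name here (run_task, Step, the toy models) is hypothetical, and a small dictionary of scene facts stands in for a real camera frame.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a camera image: a few facts about the scene.
Frame = Dict[str, bool]

# Assumed interfaces: the first model sees (task, frame); the second sees
# (task, step prompt, frame). Real systems would wrap learned models here.
PromptModel = Callable[[str, Frame], str]
ActionModel = Callable[[str, str, Frame], str]


@dataclass
class Step:
    step_prompt: str   # mid-level instruction from the first model
    micro_action: str  # low-level command from the second model


def run_task(master_prompt: str, frames: Iterable[Frame],
             prompt_model: PromptModel, action_model: ActionModel) -> List[Step]:
    """Re-derive the step prompt and micro-action for every incoming frame."""
    trace = []
    for frame in frames:
        step_prompt = prompt_model(master_prompt, frame)
        micro_action = action_model(master_prompt, step_prompt, frame)
        trace.append(Step(step_prompt, micro_action))
    return trace


# Toy rule-based stand-ins so the sketch runs without any ML dependencies.
def toy_prompt_model(master_prompt: str, frame: Frame) -> str:
    if frame["holding_cup"]:
        return "carry the cup to the sink"
    return "grip the cup handle"


def toy_action_model(master_prompt: str, step_prompt: str, frame: Frame) -> str:
    return "close_gripper" if "grip" in step_prompt else "move_arm_toward_sink"


frames = [{"holding_cup": False}, {"holding_cup": True}]
trace = run_task("put the cup in the sink", frames,
                 toy_prompt_model, toy_action_model)
for step in trace:
    print(step.step_prompt, "->", step.micro_action)
```

The point of the sketch is the control flow: nothing is planned ahead of time, so when the second frame shows the cup in hand, the mid-level instruction changes and the low-level command changes with it.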

How the prompt chain drives Samsung's micro-action pipeline

The patent describes a processor-implemented pipeline with three layers of abstraction for robot control.

At the top sits the master prompt — a natural-language description of the overall task (e.g., 'pick up the red block and place it in the bin'). The robot continuously captures frame images — think of these as snapshots from an onboard camera representing the robot's current view of the world.

A prompt generation model (a vision-language model that understands both images and text) takes the master prompt and the current frame image and produces a step prompt — a dynamically generated sub-task description that bridges the gap between the high-level goal and what's happening in the scene right now. This is the system's way of saying 'given where we are, here's the immediate objective.'

Finally, an action generation model receives all three inputs (master prompt, step prompt, and frame image) and outputs a micro-action: a low-level, executable movement command, for example joint angles, gripper states, or velocity vectors. The patent also references a detokenizer component, which suggests the action output is decoded from a token-based representation. That would be consistent with transformer-style architectures being applied to robot control.
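The article only establishes that a detokenizer exists, not how it works. One common scheme in published token-based robot policies is uniform binning, where each action dimension is split into a fixed number of bins and a token picks a bin. The sketch below assumes that scheme; the function name, the 256-bin vocabulary, and the three example dimensions are all illustrative assumptions rather than details from the filing.

```python
from typing import List, Sequence

NUM_BINS = 256  # assumed number of discrete tokens per action dimension


def detokenize(tokens: Sequence[int], low: Sequence[float],
               high: Sequence[float], num_bins: int = NUM_BINS) -> List[float]:
    """Map one integer token per action dimension to the centre of its bin."""
    if not (len(tokens) == len(low) == len(high)):
        raise ValueError("tokens, low and high must have the same length")
    values = []
    for token, lo, hi in zip(tokens, low, high):
        if not 0 <= token < num_bins:
            raise ValueError(f"token {token} is outside [0, {num_bins})")
        width = (hi - lo) / num_bins            # size of one bin
        values.append(lo + (token + 0.5) * width)  # bin centre
    return values


# Hypothetical 3-dimensional micro-action:
# joint angle in radians, gripper opening in [0, 1], speed in m/s.
low = [-1.0, 0.0, 0.0]
high = [1.0, 1.0, 0.5]
action = detokenize([0, 255, 128], low, high)
print(action)
```

Decoding to the bin centre rather than the bin edge keeps the worst-case rounding error at half a bin width, which is one reason this kind of discretisation is a popular choice when a language-model-style decoder has to emit continuous commands.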

What this means for AI-powered robotics and Samsung's ambitions

Robotics has long struggled with the gap between task-level instructions and motor-level execution. Most classical approaches require painstaking hand-coded motion sequences. Using vision-language models to dynamically generate intermediate instructions — and then grounding those in real-time visual context — is exactly the direction the field is moving, and Samsung is staking a claim in that space.

For you as a consumer, this kind of architecture is what makes household robots plausible: a robot that can handle a cluttered counter or an unexpected obstacle because it's replanning at each frame, not just running a fixed program. Samsung has been publicly investing in humanoid and service robotics, and this patent fits squarely into that trajectory.

Editorial take

This is a technically coherent contribution to the robot learning pipeline problem. The prompt-chaining approach mirrors work published by researchers at DeepMind, Google, and Physical Intelligence, so Samsung is at minimum keeping pace with the frontier. Whether this specific two-model hierarchy ends up in a shipping product or gets subsumed by a single end-to-end model is genuinely uncertain, but the filing shows Samsung is thinking seriously about robotics at the AI-architecture level.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.