IBM · Filed Dec 20, 2024 · Published Jun 25, 2026 · verified — real USPTO data

IBM Patents a Tool That Spots File-Encrypting Attacks by Sorting Data Into Groups First

Most ransomware detectors scan every storage volume the same way, whether it holds billing records or raw video files. IBM's new patent proposes sorting volumes into behavioral clusters first, then tailoring detection to each group.

FIG. 1A from US 2026/0178736 A1 (Grouped Ransomware Detection in Storage Systems), rendered from the official USPTO publication PDF.
Publication number: US 2026/0178736 A1
Applicant: International Business Machines Corporation
Filing date: Dec 20, 2024
Publication date: Jun 25, 2026
Inventors: Charalampos Pozidis, Dionysios Diamantopoulos, Roman Alexander Pletka, Yves Alexandre Beraldo dos Santos, Andrew D. Walls
CPC classification: 726/23
Grant likelihood: Medium
Examiner: WILLIAMS, JEFFERY L (Art Unit 2495)
Status: Docketed New Case - Ready for Examination (Jan 30, 2025)
Document: 20 claims

How IBM's grouped storage scanning catches ransomware faster

Imagine a hospital with thousands of filing cabinets. Some hold patient charts, some hold billing records, some hold scanned X-rays. A security guard checking all of them the same way would either miss threats in the unusual ones or waste time raising false alarms in the normal ones. IBM's patent tackles the same problem in data storage.

The system watches how storage volumes behave over time and groups together the ones that act alike. A volume full of files that rarely change will sit in one group; a busy database volume that rewrites data constantly will sit in another. Each group is monitored against its own normal baseline.

When ransomware hits, it tends to scramble files in ways that look wildly out of place for that specific group's habits. By comparing activity against a group-level baseline rather than a single universal rule, the system can spot the anomaly faster and with fewer false alarms.
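
To make the difference between a universal rule and a group-level baseline concrete, here is a minimal sketch. It is not taken from the patent: the overwrite rates, the 0.50 universal threshold, and the z-score test are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence

# Hypothetical hourly overwrite rates (fraction of blocks rewritten) per group.
archive_history = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01]
database_history = [0.55, 0.60, 0.58, 0.62, 0.57, 0.59]

UNIVERSAL_THRESHOLD = 0.50  # assumed single rule applied to every volume
Z_LIMIT = 3.0               # assumed number of standard deviations that counts as anomalous


def universal_alert(rate: float) -> bool:
    """One-size-fits-all rule: alert whenever the rate exceeds a fixed threshold."""
    return rate > UNIVERSAL_THRESHOLD


def group_alert(rate: float, history: Sequence[float]) -> bool:
    """Group-level rule: alert when the rate is far from that group's own mean."""
    mu, sigma = mean(history), stdev(history)
    return abs(rate - mu) / sigma > Z_LIMIT


normal_db_rate = 0.60         # a database volume doing its usual work
attacked_archive_rate = 0.30  # an archive volume suddenly being rewritten

print("universal rule, busy database:", universal_alert(normal_db_rate))
print("group rule, busy database:", group_alert(normal_db_rate, database_history))
print("universal rule, attacked archive:", universal_alert(attacked_archive_rate))
print("group rule, attacked archive:", group_alert(attacked_archive_rate, archive_history))
```

With these made-up numbers the fixed threshold raises a false alarm on the database and misses the archive, while the group-level test gets both right.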

How vector clustering drives the detection logic

The patent describes a method for organizing a storage system's volumes into clusters based on shared behavioral characteristics, then running ransomware detection at the cluster level rather than volume by volume.

The core steps are:

  • Each storage volume is converted into a vector (a list of numbers representing its characteristics, such as read/write frequency, data entropy, and file-change patterns).
  • Volumes with similar vectors are grouped together using clustering, a standard machine-learning technique that finds natural groupings without being told in advance what those groups should look like.
  • Those groups are tracked over time inside a multi-dimensional vector space (essentially a mathematical map where similar volumes sit close together).
  • Ransomware detection is then applied at the group level, so the system learns what "normal" looks like for each cluster and flags deviations from that cluster's baseline.
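
The steps above can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not the patent's implementation: the article does not name a clustering algorithm, so plain k-means stands in for it, and the feature values, the `MARGIN` tolerance, and the distance-from-centroid baseline are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]

# Hypothetical features per volume: (write rate, data entropy, fraction of files
# changed), each scaled to 0-1. The values are made up for illustration.
volumes: Dict[str, Vector] = {
    "archive-1": (0.02, 0.40, 0.01),
    "archive-2": (0.03, 0.42, 0.02),
    "archive-3": (0.01, 0.38, 0.01),
    "db-1": (0.80, 0.60, 0.55),
    "db-2": (0.85, 0.62, 0.60),
    "db-3": (0.78, 0.58, 0.57),
}


def centroid(points: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty set of vectors."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def kmeans(points: List[Vector], k: int, iterations: int = 10) -> Tuple[List[int], List[Vector]]:
    """Plain k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Seed the next center with the point farthest from all existing centers.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points]
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = centroid(members)
    return labels, centers


names = list(volumes)
labels, centers = kmeans([volumes[n] for n in names], k=2)
cluster_of = dict(zip(names, labels))

# Baseline per cluster: the largest distance any member sits from its centroid.
radius = [
    max(math.dist(volumes[n], centers[i]) for n in names if cluster_of[n] == i)
    for i in range(len(centers))
]

MARGIN = 3.0  # assumed tolerance: how many baseline radii count as a deviation


def is_suspicious(name: str, observation: Vector) -> bool:
    """Flag a volume whose latest vector strays far from its own cluster's centroid."""
    cluster = cluster_of[name]
    return math.dist(observation, centers[cluster]) > MARGIN * radius[cluster]


print(is_suspicious("archive-2", (0.70, 0.99, 0.65)))  # sudden high-entropy rewrites
print(is_suspicious("db-1", (0.82, 0.61, 0.56)))       # ordinary database churn
```

Note that the check compares each observation against the cluster the volume already belongs to. An archive volume that suddenly behaves like a database is flagged even though its new activity would be unremarkable on a database volume.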

The practical effect is that context-aware detection replaces one-size-fits-all scanning. A volume full of frequently rewritten database logs won't trigger alerts just because it rewrites data constantly; that's expected behavior for its group.

What this means for enterprise storage security

Enterprise storage systems can hold thousands of volumes with completely different usage patterns. Applying the same detection rules to all of them is how defenders end up with either alert fatigue (too many false positives) or blind spots (anomalies that don't cross a generic threshold). Clustering-first detection is a practical fix for both problems.

For large organizations running IBM storage infrastructure, this approach could mean faster response times when ransomware actually strikes, because the signal-to-noise ratio improves. It should also scale without extra effort: a newly added volume is assigned to an existing cluster or seeds a new one, with no manual rule-writing required.
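
One way to picture how a new volume joins an existing cluster or starts its own is a nearest-centroid check with a distance cutoff. The sketch below is an assumption for illustration only: the function name `assign_volume`, the centroid values, and the `max_distance` cutoff are hypothetical and do not come from the patent.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def assign_volume(vector: Sequence[float], centroids: List[Vector], max_distance: float) -> int:
    """Return the index of the cluster for a new volume.

    The volume joins the nearest existing cluster if that cluster's centroid is
    within max_distance; otherwise it becomes the seed of a new cluster.
    """
    if centroids:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))
        if math.dist(vector, centroids[nearest]) <= max_distance:
            return nearest
    centroids.append(tuple(vector))
    return len(centroids) - 1


# Hypothetical centroids for an "archive-like" and a "database-like" cluster.
centroids: List[Vector] = [(0.02, 0.40, 0.01), (0.81, 0.60, 0.57)]

joined = assign_volume((0.03, 0.41, 0.02), centroids, max_distance=0.2)
created = assign_volume((0.40, 0.95, 0.05), centroids, max_distance=0.2)
print(joined, created, len(centroids))
```

The first volume lands in the archive-like cluster; the second is too far from both centroids and opens a third cluster. A production system would presumably also re-cluster periodically as behavior drifts, which this sketch leaves out.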

Editorial take

This is solid, practical security engineering rather than a flashy headline grab. Clustering before scanning is a sensible architectural choice that addresses a real weakness in current detection systems. It's not the kind of patent that signals a brand-new product category, but it's the kind that ends up in production software.

Source. Full patent text and figures from the official USPTO publication PDF.

Editorial commentary on a publicly published patent application. Not legal advice.