Anthropic Accuses Chinese AI Labs of Industrial-Scale Model Theft
Anthropic alleges DeepSeek, Moonshot AI, and MiniMax used 24,000 fake accounts to extract Claude capabilities through 16 million distillation queries.
Anthropic publicly accused three Chinese AI companies of running coordinated campaigns to illegally extract capabilities from its Claude model. According to the company, DeepSeek, Moonshot AI, and MiniMax created over 24,000 fraudulent accounts and generated more than 16 million exchanges designed to train their own models on Claude's outputs.
The disclosure, published on Anthropic's blog on February 23, 2026, arrives as U.S. policymakers debate stricter export controls on AI chips and model weights. All three accused companies are based in China, where Anthropic formally prohibits use of its services due to legal and security concerns.
What Is Model Distillation?
Distillation attacks involve systematically querying a more capable AI model and using its outputs to train a smaller, cheaper model. The technique lets attackers skip the expensive compute and data collection required to build frontier capabilities from scratch.
It's roughly analogous to a student copying an expert's homework answers to pass a test. The student learns nothing about the underlying subject, but the answers look correct.
Distillation isn't inherently malicious. Researchers legitimately use it to create efficient models for deployment on mobile devices or in resource-constrained environments. What makes these campaigns different is scale, intent, and terms-of-service violations.
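To make the mechanics concrete, here is a minimal sketch of the classical form of distillation, in which a student model is trained to match a teacher's output distribution. Against a closed API an attacker sees only generated text, so extraction in practice looks more like supervised fine-tuning on collected prompt-response pairs; the logit-matching version below is shown for intuition only, and the models and data are placeholders, not anyone's actual pipeline.

```python
# Minimal sketch of knowledge distillation: a student is trained to match a
# teacher's output distribution. Models, dimensions, and data are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: KL divergence between temperature-scaled distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature**2

teacher = nn.Linear(128, 1000)  # stand-in for a frontier model's output head
student = nn.Linear(128, 1000)  # stand-in for the smaller, cheaper student
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

batch = torch.randn(32, 128)    # placeholder input features
with torch.no_grad():
    teacher_logits = teacher(batch)  # in an attack, harvested via API queries

optimizer.zero_grad()
loss = distillation_loss(student(batch), teacher_logits)
loss.backward()
optimizer.step()
```

The temperature softens the teacher's distribution so the student learns from the relative weighting the teacher assigns to wrong answers, not just the top prediction; that richer signal is what makes distillation far cheaper than training from scratch.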
Attack Scale and Targets
Anthropic tracked distinct patterns across the three companies:
DeepSeek: More than 150,000 exchanges focused on foundational logic and alignment, specifically around generating "censorship-safe alternatives to policy-sensitive queries." This suggests efforts to replicate Claude's response patterns while bypassing its safety guidelines.
Moonshot AI: Over 3.4 million exchanges targeting agentic reasoning and tool use, coding and data analysis, computer-use agent development, and computer vision capabilities. The breadth indicates systematic capability extraction.
MiniMax: Part of the broader campaign, though specific exchange counts weren't disclosed.
Combined, the three companies generated 16 million total exchanges through 24,000 fraudulent accounts. The attacks specifically targeted Claude's most differentiated features: agentic reasoning, tool use, and coding.
National Security Implications
Anthropic explicitly framed the distillation attacks as a national security concern in statements to TechCrunch, warning that illicitly distilled models may lack the safety guardrails that U.S. providers implement.
The company noted distilled models could enable authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. Without inheriting the original model's safety training, distilled versions might readily assist with tasks that Claude refuses.
This concern isn't hypothetical. We've tracked how threat actors increasingly leverage AI assistance for attack planning, including one campaign where actors used both DeepSeek and Claude to generate exploitation code.
How Anthropic Detected the Attacks
The disclosure didn't detail specific detection methods, but the company referenced pattern analysis across usage metrics. Suspicious indicators likely included the following (a toy scoring sketch follows the list):
- Accounts with unusual query patterns optimized for capability extraction rather than normal use
- Systematic coverage of specific capability domains
- Queries structured to elicit maximum information per exchange
- Geographic access patterns inconsistent with account registration
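One hedged illustration of how indicators like these might be combined into an account risk score. This is a toy heuristic, not Anthropic's disclosed method; every field name and threshold below is an invented placeholder.

```python
# Hypothetical extraction-detection heuristic: score accounts on the kinds of
# indicators listed above and flag outliers for review. Thresholds are invented.
from dataclasses import dataclass

@dataclass
class AccountStats:
    queries_per_day: float       # sustained volume beyond interactive use
    domains_covered: int         # distinct capability domains probed
    avg_tokens_per_query: float  # long, information-dense exchanges
    geo_mismatch: bool           # access region differs from registration

def extraction_risk(a: AccountStats) -> float:
    """Crude 0-1 risk score; weights and cutoffs are illustrative only."""
    score = 0.0
    if a.queries_per_day > 500:        # far beyond typical human usage
        score += 0.3
    if a.domains_covered > 10:         # systematic sweep across capabilities
        score += 0.3
    if a.avg_tokens_per_query > 2000:  # maximizing information per exchange
        score += 0.2
    if a.geo_mismatch:
        score += 0.2
    return score

suspect = AccountStats(queries_per_day=4000, domains_covered=18,
                       avg_tokens_per_query=3500, geo_mismatch=True)
if extraction_risk(suspect) >= 0.7:
    print("flag account for manual review")
```

In a real deployment, per-account scores like this would feed into clustering across accounts, since a 24,000-account campaign is most visible when coordinated behavior is analyzed in aggregate rather than one account at a time.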
Anthropic has reportedly terminated the fraudulent accounts and implemented additional detection capabilities to identify similar campaigns.
The Bigger Picture
The accusations land amid escalating tensions over AI development between the U.S. and China. Export controls on advanced chips aim to slow Chinese AI progress, but distillation attacks represent an alternative path to acquiring capabilities without the restricted hardware.
For organizations building AI applications, the incident highlights why model security extends beyond prompt injection and jailbreaks. Companies hosting valuable models must consider systematic capability extraction as a threat vector.
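One illustrative defensive control, assumed here rather than sourced from the article, is a per-account output budget that makes training-scale harvesting expensive while leaving normal use unaffected. The class, limits, and account IDs below are hypothetical.

```python
# Hypothetical mitigation: cap per-account output volume so no single account
# can harvest training-scale data. All names and limits are placeholders.
import time
from collections import defaultdict

DAILY_TOKEN_BUDGET = 200_000  # generous for normal use, hostile to extraction

class TokenBudget:
    def __init__(self, budget=DAILY_TOKEN_BUDGET):
        self.budget = budget
        self.usage = defaultdict(int)  # account_id -> tokens served today
        self.day = time.strftime("%Y-%m-%d")

    def allow(self, account_id: str, tokens_requested: int) -> bool:
        today = time.strftime("%Y-%m-%d")
        if today != self.day:          # reset counters at the day boundary
            self.usage.clear()
            self.day = today
        if self.usage[account_id] + tokens_requested > self.budget:
            return False               # deny and surface the account for review
        self.usage[account_id] += tokens_requested
        return True

budget = TokenBudget()
print(budget.allow("acct_123", 5_000))    # True: within budget
print(budget.allow("acct_123", 500_000))  # False: would exceed the daily cap
```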
The European Parliament's IT department recently blocked Microsoft Copilot AI features on work devices over similar concerns about sensitive data flowing through AI systems. As AI capabilities become strategic assets, expect both protective measures and extraction attempts to intensify.
Related Articles
Attackers Scan for Exposed Self-Hosted Anthropic Models
SANS ISC detects reconnaissance activity targeting locally hosted Claude API endpoints. Researchers warn of growing risk from misconfigured AI deployments.
Feb 2, 2026

Claude Code Flaws Let Malicious Repos Steal API Keys, Run Code
Check Point found CVE-2025-59536 and CVE-2026-21852 in Anthropic's Claude Code. Opening a cloned repo could execute code and leak API credentials.
Feb 26, 2026

GPT-OSS-Safeguard Models Fail Multi-Turn Jailbreak Testing
Cisco AI Defense research finds OpenAI's safeguard models perform worse than standard versions under sustained attack. Multi-turn jailbreaks spike success rates up to 92%.
Feb 19, 2026

Dell Zero-Day Exploited by Chinese Hackers Since 2024
Chinese threat group UNC6201 exploited a critical hardcoded credential flaw (CVE-2026-22769) in Dell RecoverPoint for 18 months before disclosure. Patch now.
Feb 18, 2026