Anthropic Accuses Chinese AI Labs of Industrial-Scale Model Theft
Anthropic alleges DeepSeek, Moonshot AI, and MiniMax used 24,000 fake accounts to extract Claude capabilities through 16 million distillation queries.
Anthropic publicly accused three Chinese AI companies of running coordinated campaigns to illegally extract capabilities from its Claude model. According to the company, DeepSeek, Moonshot AI, and MiniMax created over 24,000 fraudulent accounts and generated more than 16 million exchanges designed to train their own models on Claude's outputs.
The disclosure, published on Anthropic's blog on February 23, 2026, arrives as U.S. policymakers debate stricter export controls on AI chips and model weights. All three accused companies are based in China, where Anthropic formally prohibits use of its services due to legal and security concerns.
What Is Model Distillation?
Distillation attacks involve systematically querying a more capable AI model and using its outputs to train a smaller, cheaper model. The technique lets attackers skip the expensive compute and data collection required to build frontier capabilities from scratch.
It's roughly analogous to a student copying an expert's homework answers to pass a test. The student learns nothing about the underlying subject, but the answers look correct.
Distillation isn't inherently malicious. Researchers legitimately use it to create efficient models for deployment on mobile devices or in resource-constrained environments. What makes these campaigns different is scale, intent, and terms-of-service violations.
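To make the mechanics concrete, here is a minimal sketch of the classical form of distillation, in which a student model is trained to match a teacher's output distribution. Against a closed API an attacker sees only generated text, so extraction in practice looks more like supervised fine-tuning on collected prompt-response pairs; the logit-matching version below is shown for intuition only, and the models and data are placeholders, not anyone's actual pipeline.

```python
# Minimal sketch of knowledge distillation: a student is trained to match a
# teacher's output distribution. Models, dimensions, and data are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: KL divergence between temperature-scaled distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature**2

teacher = nn.Linear(128, 1000)  # stand-in for a frontier model's output head
student = nn.Linear(128, 1000)  # stand-in for the smaller, cheaper student
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

batch = torch.randn(32, 128)    # placeholder input features
with torch.no_grad():
    teacher_logits = teacher(batch)  # in an attack, harvested via API queries

optimizer.zero_grad()
loss = distillation_loss(student(batch), teacher_logits)
loss.backward()
optimizer.step()
```

The temperature softens the teacher's distribution so the student learns from the relative weighting the teacher assigns to wrong answers, not just the top prediction; that richer signal is what makes distillation far cheaper than training from scratch.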
Attack Scale and Targets
Anthropic tracked distinct patterns across the three companies:
DeepSeek: More than 150,000 exchanges focused on foundational logic and alignment, specifically around generating "censorship-safe alternatives to policy-sensitive queries." This suggests efforts to replicate Claude's response patterns while bypassing its safety guidelines.
Moonshot AI: Over 3.4 million exchanges targeting agentic reasoning and tool use, coding and data analysis, computer-use agent development, and computer vision capabilities. The breadth indicates systematic capability extraction.
MiniMax: Part of the broader campaign, though specific exchange counts weren't disclosed.
Combined, the three companies generated 16 million total exchanges through 24,000 fraudulent accounts. The attacks specifically targeted Claude's most differentiated features: agentic reasoning, tool use, and coding.
National Security Implications
Anthropic explicitly framed the distillation attacks as a national security concern in statements to TechCrunch, warning that illicitly distilled models may lack the safety guardrails that U.S. providers implement.
The company noted distilled models could enable authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. Without inheriting the original model's safety training, distilled versions might readily assist with tasks that Claude refuses.
This concern isn't hypothetical. We've tracked how threat actors increasingly leverage AI assistance for attack planning, including one campaign where actors used both DeepSeek and Claude to generate exploitation code.
How Anthropic Detected the Attacks
The disclosure didn't detail specific detection methods, but the company referenced pattern analysis across usage metrics. Suspicious indicators likely included the following (a toy scoring sketch follows the list):
- Accounts with unusual query patterns optimized for capability extraction rather than normal use
- Systematic coverage of specific capability domains
- Queries structured to elicit maximum information per exchange
- Geographic access patterns inconsistent with account registration
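One hedged illustration of how indicators like these might be combined into an account risk score. This is a toy heuristic, not Anthropic's disclosed method; every field name and threshold below is an invented placeholder.

```python
# Hypothetical extraction-detection heuristic: score accounts on the kinds of
# indicators listed above and flag outliers for review. Thresholds are invented.
from dataclasses import dataclass

@dataclass
class AccountStats:
    queries_per_day: float       # sustained volume beyond interactive use
    domains_covered: int         # distinct capability domains probed
    avg_tokens_per_query: float  # long, information-dense exchanges
    geo_mismatch: bool           # access region differs from registration

def extraction_risk(a: AccountStats) -> float:
    """Crude 0-1 risk score; weights and cutoffs are illustrative only."""
    score = 0.0
    if a.queries_per_day > 500:        # far beyond typical human usage
        score += 0.3
    if a.domains_covered > 10:         # systematic sweep across capabilities
        score += 0.3
    if a.avg_tokens_per_query > 2000:  # maximizing information per exchange
        score += 0.2
    if a.geo_mismatch:
        score += 0.2
    return score

suspect = AccountStats(queries_per_day=4000, domains_covered=18,
                       avg_tokens_per_query=3500, geo_mismatch=True)
if extraction_risk(suspect) >= 0.7:
    print("flag account for manual review")
```

In a real deployment, per-account scores like this would feed into clustering across accounts, since a 24,000-account campaign is most visible when coordinated behavior is analyzed in aggregate rather than one account at a time.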
Anthropic has reportedly terminated the fraudulent accounts and implemented additional detection capabilities to identify similar campaigns.
The Bigger Picture
The accusations land amid escalating tensions over AI development between the U.S. and China. Export controls on advanced chips aim to slow Chinese AI progress, but distillation attacks represent an alternative path to acquiring capabilities without the restricted hardware.
For organizations building AI applications, the incident highlights why model security extends beyond prompt injection and jailbreaks. Companies hosting valuable models must consider systematic capability extraction as a threat vector.
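One illustrative defensive control, assumed here rather than sourced from the article, is a per-account output budget that makes training-scale harvesting expensive while leaving normal use unaffected. The class, limits, and account IDs below are hypothetical.

```python
# Hypothetical mitigation: cap per-account output volume so no single account
# can harvest training-scale data. All names and limits are placeholders.
import time
from collections import defaultdict

DAILY_TOKEN_BUDGET = 200_000  # generous for normal use, hostile to extraction

class TokenBudget:
    def __init__(self, budget=DAILY_TOKEN_BUDGET):
        self.budget = budget
        self.usage = defaultdict(int)  # account_id -> tokens served today
        self.day = time.strftime("%Y-%m-%d")

    def allow(self, account_id: str, tokens_requested: int) -> bool:
        today = time.strftime("%Y-%m-%d")
        if today != self.day:          # reset counters at the day boundary
            self.usage.clear()
            self.day = today
        if self.usage[account_id] + tokens_requested > self.budget:
            return False               # deny and surface the account for review
        self.usage[account_id] += tokens_requested
        return True

budget = TokenBudget()
print(budget.allow("acct_123", 5_000))    # True: within budget
print(budget.allow("acct_123", 500_000))  # False: would exceed the daily cap
```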
The European Parliament's IT department recently blocked Microsoft Copilot AI features on work devices over similar concerns about sensitive data flowing through AI systems. As AI capabilities become strategic assets, expect both protective measures and extraction attempts to intensify.
Related Articles
Attackers Scan for Exposed Self-Hosted Anthropic Models
SANS ISC detects reconnaissance activity targeting locally hosted Claude API endpoints. Researchers warn of growing risk from misconfigured AI deployments.
Feb 2, 2026

Claude Code Flaws Let Malicious Repos Steal API Keys, Run Code
Check Point found CVE-2025-59536 and CVE-2026-21852 in Anthropic's Claude Code. Opening a cloned repo could execute code and leak API credentials.
Feb 26, 2026

GPT-OSS-Safeguard Models Fail Multi-Turn Jailbreak Testing
Cisco AI Defense research finds OpenAI's safeguard models perform worse than standard versions under sustained attack. Multi-turn jailbreaks spike success rates up to 92%.
Feb 19, 2026

Dell Zero-Day Exploited by Chinese Hackers Since 2024
Chinese threat group UNC6201 exploited a critical hardcoded credential flaw (CVE-2026-22769) in Dell RecoverPoint for 18 months before disclosure. Patch now.
Feb 18, 2026