The Rise of Industrial-Scale Model Extraction

Anthropic revealed that Chinese AI firms used "hydra clusters" and millions of prompts to illicitly distill Claude’s reasoning and coding capabilities, bypassing safeguards and posing significant global security risks.

Yiannis Bakopoulos, assisted by Gemini AI. Source: The Hacker News

2/24/2026 · 2 min read


Anthropic recently disclosed the discovery of "industrial-scale" distillation campaigns conducted by three Chinese AI firms: DeepSeek, Moonshot AI, and MiniMax. These companies allegedly bypassed regional restrictions and terms of service to systematically drain the capabilities of the Claude LLM to bolster their own proprietary models.

Understanding the Mechanics: Distillation vs. Extraction

In the AI industry, distillation is a standard process in which a smaller, "student" model is trained on the outputs of a larger, "teacher" model. While legitimate when used internally to create faster, cheaper versions of one’s own tech, it becomes model extraction when a competitor uses it to "steal" the reasoning and logic of a frontier model at a fraction of the original R&D cost.
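The "student imitates teacher" objective at the heart of distillation can be sketched in a few lines. This is a minimal, generic illustration of the standard soft-target loss (the function names and temperature value are illustrative, not anything from Anthropic's report):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert raw logits to a probability distribution at a given temperature."""
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened outputs.

    The student is trained to reproduce the teacher's full output
    distribution, not just its top answer - this is how capability
    transfers at a fraction of the teacher's original R&D cost.
    """
    p = softmax(teacher_logits, temperature)  # teacher's "soft targets"
    q = softmax(student_logits, temperature)
    return float(np.sum(p * np.log(p / q)))

teacher = np.array([4.0, 1.0, 0.5])
aligned = distillation_loss(teacher, np.array([4.0, 1.0, 0.5]))   # ~0.0
diverged = distillation_loss(teacher, np.array([0.5, 1.0, 4.0]))  # > 0
```

A student whose outputs already match the teacher's incurs near-zero loss; training drives the diverging student's loss toward zero, pulling its behavior toward the teacher's.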

The scale of these attacks was unprecedented:

  • Total Exchanges: Over 16 million prompts.

  • Infrastructure: Approximately 24,000 fraudulent accounts.

  • Methodology: The use of "hydra cluster" architectures—massive proxy networks that rotate accounts to avoid detection and blend illicit traffic with legitimate user requests.
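The arithmetic behind the rotation is worth making explicit. The sketch below (a simplification; real hydra clusters also rotate IPs and vary timing) shows why spreading the reported 16 million prompts across roughly 24,000 accounts keeps each account's volume unremarkable:

```python
def round_robin_volume(num_requests, num_accounts):
    """Per-account request count when traffic is spread evenly across a
    rotating account pool - the core idea of a 'hydra cluster'. Each
    account carries so little traffic that none stands out individually.
    """
    base, extra = divmod(num_requests, num_accounts)
    # `extra` accounts carry one extra request; the rest carry `base`.
    return [base + 1] * extra + [base] * (num_accounts - extra)

# The reported figures: 16M+ prompts over ~24,000 accounts averages
# under 700 prompts per account across the campaign's lifetime.
volumes = round_robin_volume(16_000_000, 24_000)
peak = max(volumes)  # 667
```

A few hundred prompts per account is well within the range of a heavy legitimate user, which is precisely what lets the illicit traffic blend in.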

Targeted Capabilities and Strategic Goals

The campaigns were not random; they targeted Claude’s most sophisticated and "differentiated" features.

| Company | Focus Area | Scale |
| --- | --- | --- |
| DeepSeek | Reasoning, rubric grading, and creating "censorship-safe" responses for sensitive political topics. | 150,000+ exchanges |
| Moonshot AI | Agentic reasoning, tool use, coding, and computer vision. | 3.4 million exchanges |
| MiniMax | Specialized agentic coding and tool-use capabilities. | 13 million+ exchanges |

National Security and Safety Risks

Anthropic warned that these illicitly distilled models pose a unique threat to national security. Because distillation captures the "intelligence" of a model without inheriting its safety filters or constitutional guardrails, these secondary models may possess dangerous capabilities (e.g., assisting in offensive cyber operations or spreading disinformation) without the original model's built-in protections.

Furthermore, these tools can be weaponized by authoritarian regimes for mass surveillance and military intelligence, effectively using American innovation to fuel foreign strategic interests.

Defensive Measures and Industry Context

To combat this, Anthropic has implemented several advanced countermeasures:

  • Behavioral Fingerprinting: Using AI classifiers to identify the specific "texture" of distillation prompts, which differ significantly from human conversation.

  • Verification Hardening: Stricter vetting for educational and startup accounts often used as fronts.

  • Output Modification: Adjusting model responses to make them less effective for training purposes without degrading the experience for human users.
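To make the fingerprinting idea concrete, here is a toy heuristic in the spirit of the first countermeasure. The thresholds and features are purely illustrative assumptions, not Anthropic's actual classifier: machine-generated extraction prompts tend to arrive in large batches of near-uniform, templated text, while human conversations vary widely in length and phrasing.

```python
from statistics import pstdev

def looks_like_distillation(prompts, min_batch=100, max_length_spread=5.0):
    """Toy detector for the 'texture' of distillation traffic.

    Flags a batch of prompts if it is both large and suspiciously
    uniform in length - two crude proxies for templated, automated
    extraction. Thresholds here are illustrative only.
    """
    if len(prompts) < min_batch:
        return False  # too small a sample to call it a campaign
    lengths = [len(p) for p in prompts]
    return pstdev(lengths) < max_length_spread

# Templated extraction prompts: high volume, near-identical structure.
templated = [f"Explain step {i} of the algorithm in detail." for i in range(200)]
# Human traffic: low volume, highly variable.
human = ["hi", "can you fix this bug?", "why does my loop never terminate??"]

flag_templated = looks_like_distillation(templated)  # True
flag_human = looks_like_distillation(human)          # False
```

A production classifier would of course learn these patterns from many more signals than prompt length, but the intuition is the same: distillation traffic has a statistical texture that ordinary conversation lacks.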

Anthropic’s report follows a similar disclosure by Google Threat Intelligence, which recently disrupted a campaign targeting the Gemini model. While these attacks don't typically threaten individual users' data privacy, they represent a high-stakes "intellectual property arms race" among global AI developers.

Source: https://thehackernews.com/2026/02/anthropic-says-chinese-ai-firms-used-16.html