AI Distillation Attacks: The Case for Targeted Government Intervention
AI distillation attacks — systematic efforts to extract capabilities from frontier AI systems to train competitor models — are emerging as a potential pathway for adversaries to accelerate capability development at lower marginal cost. In February 2026, Anthropic, OpenAI, and Google published evidence of systematic campaigns by Chinese AI companies to extract capabilities from American frontier models at industrial scale.
Left unaddressed, distillation attacks could erode U.S. technological advantage, weaken the commercial incentives underpinning frontier AI investment, and undermine the effectiveness of existing export control regimes. While industry is already investing in technical defenses, significant enforcement gaps remain that the private sector likely cannot resolve alone.
This memo offers policy recommendations to support industry efforts to counter distillation attacks: (1) consider adding AI companies conducting unauthorized distillation attacks to the U.S. Bureau of Industry and Security (BIS) Entity List; (2) assess the merits of sanctioning those engaging in or facilitating distillation attacks under the Protecting American Intellectual Property Act of 2022; and (3) explore the development of a National Institute of Standards and Technology (NIST)-led AI Distillation Defense Framework to establish minimum defensive standards across the broader AI ecosystem.
What Are Distillation Attacks?
Distillation attacks involve a malicious actor using a “teacher” model's outputs to train a “student” model. Collecting and training on a frontier model’s outputs — including answers to user queries and the intermediate reasoning steps the model generates on the way to them — can help an attacker develop a model that approximates the original's capabilities. What distinguishes this from legitimate use is the intent to extract AI capabilities in violation of the providers’ Terms of Service, without the corresponding investment in research, compute, or safety infrastructure. That avoided investment can be substantial: Epoch AI data suggests that OpenAI and Anthropic alone have spent over $18 billion cumulatively on R&D compute since 2024.
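To make the mechanism concrete, here is a minimal sketch of output-based distillation: fine-tuning a small open model on (prompt, response) pairs harvested from a stronger model's API. The model, data, and hyperparameters are illustrative assumptions, not a reconstruction of any reported campaign.

```python
# Minimal sketch of output-based distillation, assuming the mechanism
# described above: fine-tune a small "student" on (prompt, response)
# pairs harvested from a stronger "teacher" model's API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical harvested pairs; a real campaign would collect millions.
pairs = [
    ("Explain mutexes vs. semaphores.", "A mutex grants exclusive access ..."),
    ("Write binary search in Python.", "def binary_search(a, x): ..."),
]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
student = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in student
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

student.train()
for prompt, response in pairs:
    # Standard next-token cross-entropy on the teacher's output text:
    # the student learns to imitate the teacher's behavior directly.
    batch = tokenizer(prompt + "\n" + response, return_tensors="pt",
                      truncation=True, max_length=512)
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Run at the scale described in the February 2026 reports, millions of such imitation steps are what allow a student model to approximate the teacher's behavior without the teacher's underlying R&D investment.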
There is some disagreement about the gravity of the threat. Outputs from a “teacher” model can interact unpredictably with the “student” model’s existing training and do not always yield improvements, and some important methods for increasing model capabilities, such as reinforcement learning, may benefit little from distillation. Regardless of their current efficacy, distillation attacks represent IP theft at scale, and their potential for harm warrants a proactive, precautionary response.
Distillation Attacks and America’s AI Export Control Regime
America's semiconductor export controls have imposed meaningful constraints on adversary AI development — but distillation lets attackers approximate frontier AI capabilities through API access alone, partially circumventing those restrictions. Anthropic's report on the topic notes, however, that conducting distillation attacks at scale still requires substantial compute, meaning export controls can directly constrain attack capacity. Addressing distillation attacks is thus one component of a defense-in-depth strategy: it maximizes the effect of export controls and raises the cost of adversarial capability acquisition across the AI stack.
The Case for a Government Response to Distillation Attacks
Industry defenses against distillation attacks are necessary but insufficient, for three reasons:
There is clear evidence that Chinese companies are attempting to distill leading frontier AI models into competing ones, including those that could have military applications. Anthropic and OpenAI both identified DeepSeek as conducting distillation attacks. A February 2026 CSET analysis of over 9,000 People’s Liberation Army (“PLA”) requests for proposal in 2023 and 2024 found that the Chinese military is actively seeking to use AI in its operations, including DeepSeek models. An October 2025 Jamestown Foundation report similarly found that DeepSeek models were being used in Chinese military and public security settings. These considerations place distillation attacks squarely within the government's remit.
Detection without credible enforcement is unlikely to alter adversary incentives. The perpetrators named in the AI companies’ reports presumably knew that they were violating Terms of Service. They proceeded regardless because the expected cost is trivial relative to the value of the extracted capabilities. The campaigns are also adaptive: Anthropic’s report suggested that within 24 hours of a new model release, MiniMax pivoted to capture its capabilities. Only by imposing meaningful penalties can the government materially change the calculus that currently makes distillation a rational strategy.
Commercial proxy services have become key enablers of distillation attacks, creating a market and coordination failure that no single company can resolve. Proxy services profit by enabling unauthorized access to frontier models; their commercial incentives run directly counter to preventing distillation because their revenue scales with access volume, including the volume generated by systematic distillation attacks. In February 2026, Anthropic documented “hydra cluster” architectures: sprawling networks of fraudulent accounts, operated by commercial proxy services, that sit outside any single company's control. Anthropic identified over 16 million exchanges generated through approximately 24,000 fraudulent accounts (20,000 of them linked to a single proxy network) targeting agentic reasoning, tool use, and coding capabilities; the sketch below illustrates why such distributed extraction evades any single provider's view.
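To illustrate the coordination failure, consider a simplified detection heuristic of the kind a single provider might run over its own API logs, flagging account pairs whose prompt streams overlap far more than organic traffic plausibly would. This is a hypothetical sketch under assumed names and thresholds, not Anthropic's actual method.

```python
# Hypothetical detection heuristic, not any provider's actual method:
# flag account pairs whose prompt streams overlap far more than organic
# traffic plausibly would. Function names and the threshold are assumptions.
from collections import defaultdict
from itertools import combinations

def shingles(text, k=8):
    """Character k-grams as a cheap prompt fingerprint."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def suspicious_account_pairs(logs, threshold=0.6):
    """logs: iterable of (account_id, prompt) pairs.
    Returns account pairs whose pooled fingerprints exceed a
    Jaccard-similarity threshold, a signature of coordinated extraction."""
    fingerprints = defaultdict(set)
    for account, prompt in logs:
        fingerprints[account] |= shingles(prompt)
    flagged = []
    for (a, fa), (b, fb) in combinations(fingerprints.items(), 2):
        jaccard = len(fa & fb) / max(1, len(fa | fb))
        if jaccard >= threshold:
            flagged.append((a, b, round(jaccard, 2)))
    return flagged
```

The limitation is structural: each provider can only cluster accounts within its own logs, so a proxy service that splits a hydra cluster across several providers never produces enough overlap in any one log to trip the threshold. That is the gap the NIST framework proposed below is meant to help close.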
Policy Recommendations
The considerations outlined above suggest the need for targeted government intervention. The recommendations below aim to prevent unauthorized, systematic distillation attacks.
The End-User Review Committee should consider adding adversary AI companies conducting distillation attacks to the Entity List. The Entity List is a U.S. Government-maintained trade restriction list of individuals, businesses, and organizations, used to restrict covered entities’ access to sensitive items. The criterion for addition is a “reasonable cause to believe, based on specific and articulable facts” that an entity is involved in activities contrary to U.S. national security or foreign policy interests. Entity List designation would require a BIS license, reviewed under a presumption of denial, for any export, re-export, or in-country transfer of items subject to the Export Administration Regulations, which include items crucial to AI model development and deployment. In addition, under BIS's Affiliates Rule (effective September 2025 but currently paused until November 2026), any entity 50 percent or more owned, directly or indirectly, by a listed entity would be subject to the same restrictions worldwide. Designation also signals to the broader AI ecosystem, including cloud providers, chip distributors, and equipment vendors, that transacting with designated entities carries regulatory risk, reinforcing the deterrent effect of sanctions.
The U.S. Government should assess the merits of designating DeepSeek, Moonshot, MiniMax, and their principals, together with any other adversary actors engaging in or facilitating AI distillation attacks, under the Protecting American Intellectual Property Act of 2022 (“PAIP Act”). The PAIP Act requires the President to identify foreign persons who have knowingly engaged in, or benefited from, significant theft of trade secrets of U.S. persons where that theft is reasonably likely to result in, or has materially contributed to, a significant threat to U.S. national security or economic stability. Once identified, the President must impose at least five of twelve enumerated sanctions on designated entities, including full blocking sanctions on designated individuals. The first-ever PAIP Act designations were made on February 24, 2026, establishing precedent. Anthropic, OpenAI, and Google have published detailed attribution evidence; the extraction of model capabilities in aggregate via distillation attacks may meet the definition of a “trade secret” in 18 U.S.C. § 1839 (though this would be a novel interpretation); and the national security nexus is documented through PLA procurement of DeepSeek models. Critically, the PAIP Act's coverage extends to entities that have “provided significant financial, material, or technological support” for the theft, language broad enough to reach the commercial proxy services and API aggregators that facilitated the distillation campaigns.
NIST should explore the development of an AI Distillation Defense Framework to establish minimum standards across the broader AI ecosystem. The framework should include pillars covering access controls, detection and monitoring, and prevention and response. While frontier AI companies are already sharing technical indicators of distillation on an ad hoc basis, this framework would provide permanence, broaden participation, and institutionalize information-sharing through confidential channels. The primary beneficiaries would be the broader ecosystem of smaller model providers, cloud platforms, and API aggregators that currently lack the expertise to understand and implement effective defenses — and whose vulnerabilities create systemic risk across the AI stack. Given the evolving threat landscape, the framework should be updated regularly, with interim guidance published as appropriate.