US Warns of Coordinated AI Model Extraction Campaigns by Foreign Adversaries

"AI distillation, when legitimately used to produce smaller, lighter-weight models from more advanced systems, is a vital part," wrote Michael Kratsios, Director of the White House Office of Science and Technology Policy, in a memorandum that sets a sharper federal focus on what it calls coordinated, industrial-scale theft of U.S. AI capabilities.

Michael Kratsios's memorandum and the administration's directive

In a Thursday memorandum, OSTP Director Michael Kratsios warned that the U.S. government has evidence that foreign adversaries are conducting coordinated campaigns to distill frontier U.S. AI models. The memo tasks federal agencies with working with the private sector to develop best practices to identify, mitigate and remediate what it calls "industrial-scale distillation activities." It also signals an intention to treat unauthorized model extraction as a form of intellectual property exploitation with national security implications, although the memo does not spell out specific penalties.

"Tens of thousands" of accounts and iterative prompt engineering

The memorandum describes campaigns that leverage "tens of thousands" of distributed accounts to evade detection and rate limits. Those distributed probing efforts are paired with iterative prompt engineering designed to expose both model behavior and the "underlying system logic." Officials warned that attackers do not need to fully replicate frontier models to be effective: by approximating performance on targeted tasks or benchmarks, adversaries can commercialize derivative systems while sidestepping the time and cost of original development.

Jailbreak techniques and leaking internal alignment signals

The memo details the role of jailbreak techniques in exposing restricted model outputs, and it calls for new guardrails that can withstand adversarial prompting without leaking sensitive capabilities or "internal alignment signals." That language reflects concern that standard user-facing constraints can be bypassed and that such bypasses could reveal how models are aligned or why they behave as they do.

Planned defenses: expanded telemetry, tightened identity controls, and real‑time detection

Officials outlined a likely defense posture centered on three technical priorities. First, expanding telemetry and logging around model interactions to provide a richer record of probing activity. Second, tightening identity and access controls for high‑risk users to make large, distributed probing campaigns harder to pull off. Third, building detection systems capable of flagging distributed probing campaigns in real time. The administration is urging agencies and companies to work together on developing these capabilities and the best practices to operationalize them.

What this means for federal agencies, private‑sector security teams, and foreign adversaries

Federal agencies: Agencies have been tasked to coordinate with the private sector to develop remediation and mitigation best practices for "industrial-scale distillation activities."
Private‑sector security and development teams: Companies will be expected to expand telemetry and logging, tighten identity and access controls for high‑risk users, and deploy detection systems that can identify distributed probing in real time—while also engineering guardrails to resist jailbreak-style prompting.
Foreign adversaries: The memo asserts that such actors are using proxy accounts and jailbreaking techniques to extract capabilities and that the Trump administration will explore "a range of measures" to hold them accountable; it also notes attackers can commercialize approximations without fully replicating frontier models.

Downstream risks from distilled systems and the balance with legitimate distillation

Kratsios's memorandum draws a distinction between legitimate model distillation and what the administration calls abusive industrial distillation. It states that legitimate distillation — producing "smaller, lighter-weight models from more advanced systems" — is "a vital part" of the AI ecosystem. By contrast, distilled systems produced through unauthorized extraction can lack the safeguards built into U.S. models, increasing downstream risk because they may not include controls intended to enforce "neutrality, reliability and safe use." The memo therefore links technical defenses, access controls and accountability measures as a package aimed at both deterring theft and preserving the benefits of legitimate model compression and reuse.

The memorandum sets a clear agenda but leaves a key element unresolved: it signals that the White House will treat unauthorized model extraction as IP exploitation with national security implications and that the administration will explore "a range of measures" to hold foreign actors accountable, yet it does not list specific penalties or enforcement authorities. The next phase will likely be concrete guidance from agencies and collaborative standards from industry that operationalize the telemetry, identity controls and detection systems the memo prescribes — and a public moment when the administration must specify which accountability measures it will pursue.

Read the original memorandum: https://www.govinfosecurity.com/white-house-warns-ai-model-extraction-campaigns-a-31502