Google, Microsoft and xAI Just Put Their Next Models in Washington's Test Lab

Washington just pulled more of the frontier AI market into its testing room. On May 5, the Center for AI Standards and Innovation, or CAISI, announced new agreements with Google DeepMind, Microsoft, and xAI to evaluate advanced AI models before they are publicly released.

The official announcement from NIST, which houses CAISI, says the expanded collaborations add pre-deployment evaluations and targeted research to assess frontier AI capabilities and improve AI security. OpenAI and Anthropic already had partnerships with the center, and those agreements have been renegotiated to align with direction from the Commerce Department and the administration's AI Action Plan.

The Pre-Release Channel

The headline is not that the government gets to read a model card after launch. CAISI says these agreements allow government evaluation of AI models before they are publicly available, plus post-deployment assessment and other research. That turns the evaluation process into something closer to a pre-release channel for the most capable systems.

Reuters reported the same basic structure: Microsoft, Google, and xAI will give the U.S. government early access to new AI models for national security testing before public release. The companies now join OpenAI and Anthropic in a growing framework where the most important models are examined by federal evaluators before the public sees them.

The Testing Bargain

The bargain is simple but politically loaded. AI labs get a government-backed testing partner and a channel for national-security feedback. The government gets earlier visibility into systems that may affect cybersecurity, biosecurity, autonomous research, and the balance of international AI competition.

CAISI says it has completed more than 40 evaluations so far, including on state-of-the-art models that remain unreleased. That is the number that makes the announcement more than symbolic. There is already an evaluation pipeline; the new agreements make it broader and more central.

Participant	What Changed	Why It Matters
Google DeepMind, Microsoft, xAI	New CAISI agreements	Pre-release model testing expands across more frontier developers
OpenAI, Anthropic	Existing partnerships renegotiated	The earlier evaluation framework is being aligned with current policy priorities
CAISI and interagency evaluators	More access to models and research	The government gains a clearer view of frontier capabilities before deployment

Why Safeguards Come Off

The most important detail in the NIST release is easy to miss: developers frequently provide CAISI with models that have reduced or removed safeguards so evaluators can thoroughly assess national-security-related capabilities and risks.

That is uncomfortable but necessary. A locked-down public chatbot may not reveal what a model can do in the hands of a determined attacker, a state actor, or a sophisticated red team. Testing the less-restricted version gives evaluators a better shot at measuring cyber risk, tool-use behavior, and unexpected capabilities before those systems scale into the market.

What This Does Not Do

This is not a formal licensing regime. The announcement does not say CAISI can block a model launch, and it describes the agreements as collaborations that support testing, information-sharing, voluntary product improvements, best-practice development, and clearer government understanding.

That distinction matters. The United States is still trying to avoid a heavy pre-approval system while gaining enough early visibility to avoid being surprised by its own frontier labs. It is oversight by access, not yet oversight by veto.

The Release Gate

The practical effect may still be powerful. If every serious frontier developer gives CAISI pre-release access, skipping the process starts to look unusual. Enterprise buyers, federal agencies, and international partners can begin treating CAISI evaluation as part of the credibility stack around a model release.

Microsoft made a similar trust argument in its own May 5 statement, saying ongoing testing is essential to confidence in advanced AI systems and announcing evaluation agreements with both CAISI in the U.S. and the AI Security Institute in the U.K. The signal is broader than one agency: frontier model releases are becoming international security events, not just product launches.

AI-Generated Content
