US government to test new AI models from Google, Microsoft and xAI before public release

US expands AI safety evaluations with voluntary testing agreements from major tech companies

The US Department of Commerce will begin testing new artificial intelligence models and capabilities developed by Google, Microsoft Corporation, and xAI before they are released publicly.

The companies have voluntarily agreed to submit their AI systems for evaluation through the Commerce Department’s Center for AI Standards and Innovation (CAISI). The initiative expands earlier agreements signed during the Biden administration with companies including OpenAI and Anthropic.

Under the program, AI models will undergo evaluations focused on capabilities, security, collaborative research, testing standards, and best practices related to commercial AI systems.

Chris Fall, Director of CAISI, said: “These expanded industry collaborations help us scale our work in the public interest at a critical moment.”

Google’s flagship AI offering is Gemini, developed by its DeepMind division, while Microsoft’s leading AI platform is Copilot. xAI’s chatbot, Grok, has also drawn public attention in recent months over controversies related to its image-generation outputs.

CAISI revealed that it has already completed 40 evaluations of AI systems, including testing of several advanced unreleased models. However, the agency did not disclose which unreleased models it had evaluated.

Following the announcement, Microsoft stated in a company blog post that while it already conducts internal AI testing, “testing for national security and large-scale public safety risks necessarily must be a collaborative endeavour with governments.”

A spokesperson from Google DeepMind declined to comment, while representatives from xAI did not respond to media requests.

The broader collaboration signals a shift in the US government’s approach toward AI oversight. The Trump administration had previously supported a relatively hands-off strategy focused on reducing regulatory barriers to accelerate AI development and strengthen US leadership in the sector.

However, growing military use of AI technologies and concerns around increasingly powerful models appear to be influencing policy discussions. Reports also recently highlighted claims from Anthropic that it developed a highly advanced AI model called Mythos, considered too powerful for public release.

The development comes amid ongoing legal and policy debates surrounding AI safety, national security, and government access to advanced AI systems.

