AI model Claude identifies 22 security flaws in Firefox during two-week testing | The Mainstream

Anthropic revealed that its frontier AI model Claude Opus 4.6 discovered 22 vulnerabilities in the Mozilla Firefox browser during a two-week collaboration with Mozilla. Researchers said 14 of these were classified as high-severity.

According to the company, the partnership aimed to test the model’s ability to detect real-world security risks in complex software systems. Before the experiment, Claude had already performed strongly in the CyberGym benchmark, solving nearly all assigned tasks.

Researchers selected Firefox for the trial due to its complex codebase and strong security standards. “We chose Firefox because it’s both a complex codebase and one of the most well-tested and secure open-source projects in the world. This makes it a harder test of AI’s ability to find novel security vulnerabilities than the open-source software we previously used to test our models,” the research team said.

To prepare the AI model, the team first created a dataset containing older Firefox Common Vulnerabilities and Exposures (CVEs). Claude successfully reproduced a large share of these historical vulnerabilities. After this step, the model was asked to identify new vulnerabilities in the latest version of the browser.

The experiment initially focused on Firefox’s JavaScript engine, but researchers later expanded the testing to other parts of the browser software. During the analysis, Claude reviewed nearly 6,000 C++ files and generated 112 unique reports.

Each report was carefully verified by researchers before being submitted to Mozilla. After validation, 22 vulnerabilities were confirmed, including 14 high-severity bugs.

Anthropic stated that most of the issues have already been fixed through the Firefox 148 update, while the remaining vulnerabilities are expected to be resolved in upcoming releases.

The collaboration has also influenced how Mozilla approaches security. The organisation has started using Claude internally to support vulnerability detection.

Researchers added that the entire experiment required $4,000 (approximately ₹3,69,200) in API credits to run the AI model during the testing period.
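For a rough sense of scale, the figures reported above can be combined into a few derived ratios. The sketch below is back-of-the-envelope arithmetic only: the function and variable names are illustrative, the ratios are not figures published by Anthropic or Mozilla, and it assumes the $4,000 covers the whole experiment, as stated in this article.

```python
# Figures as reported in this article.
REPORTS_GENERATED = 112   # unique reports generated by Claude
CONFIRMED_VULNS = 22      # vulnerabilities confirmed after validation
HIGH_SEVERITY = 14        # confirmed vulnerabilities rated high-severity
API_COST_USD = 4000       # reported API credit spend


def derived_ratios(reports, confirmed, high, cost_usd):
    """Return simple ratios derived from the reported counts (illustrative only)."""
    return {
        "confirmation_rate": confirmed / reports,    # confirmed vulnerabilities per report
        "high_severity_share": high / confirmed,     # share of confirmed bugs rated high
        "cost_per_confirmed_usd": cost_usd / confirmed,
        "cost_per_report_usd": cost_usd / reports,
    }


ratios = derived_ratios(REPORTS_GENERATED, CONFIRMED_VULNS, HIGH_SEVERITY, API_COST_USD)

if __name__ == "__main__":
    print(f"Confirmation rate:   {ratios['confirmation_rate']:.1%}")
    print(f"High-severity share: {ratios['high_severity_share']:.1%}")
    print(f"Cost per confirmed:  ${ratios['cost_per_confirmed_usd']:.2f}")
    print(f"Cost per report:     ${ratios['cost_per_report_usd']:.2f}")
```

On these numbers, roughly one report in five led to a confirmed vulnerability, and the API spend works out to under $200 per confirmed issue. The estimate covers API credits only and leaves out the researchers' time spent verifying each report.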


