As artificial intelligence models advance faster than existing testing methods, researchers face growing challenges in keeping safety evaluations current. To address this gap, Anthropic has released Bloom, a new open-source agentic framework built to quickly generate and run behavioural evaluations for frontier AI models at scale.
Bloom allows researchers to define a specific behaviour and automatically measure how often and how strongly it appears across a wide range of generated scenarios. This approach cuts down the time needed to assess model behaviour and helps teams keep evaluations aligned with rapidly evolving systems.
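As an illustration of that workflow, the sketch below models the define-then-measure loop in plain Python. Everything here is a hypothetical stand-in rather than Bloom's actual API: `BehaviourSpec`, the template-based scenario generator, and the random-number judge are placeholders for the model-driven components Bloom uses.

```python
from dataclasses import dataclass
import random

@dataclass
class BehaviourSpec:
    """A target behaviour to evaluate, e.g. sycophancy or self-preference."""
    name: str
    description: str

def generate_scenarios(spec: BehaviourSpec, n: int) -> list[str]:
    # Placeholder: Bloom would have a model draft diverse prompts
    # designed to elicit the target behaviour.
    return [f"Scenario {i} probing {spec.name}" for i in range(n)]

def judge(response: str) -> float:
    # Placeholder: a real evaluation uses a model-based judge that
    # scores how clearly the behaviour appears (here: random 0-1).
    return random.random()

def evaluate(spec: BehaviourSpec, model, n: int = 100, threshold: float = 0.5):
    scores = [judge(model(s)) for s in generate_scenarios(spec, n)]
    frequency = sum(s >= threshold for s in scores) / n  # how often it appears
    mean_strength = sum(scores) / n                      # how strongly
    return frequency, mean_strength

# Stand-in "model" that just echoes the prompt.
freq, strength = evaluate(
    BehaviourSpec("sycophancy", "agreeing with the user against the evidence"),
    model=lambda prompt: f"Echo: {prompt}",
)
print(f"frequency={freq:.2f}, mean strength={strength:.2f}")
```

The two aggregates match the article's framing: frequency captures how often the behaviour crosses a threshold across generated scenarios, while mean strength captures how pronounced it is on average.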
The framework has already been tested on 16 frontier AI models and has shown strong agreement with human-labelled judgments. According to Anthropic, Bloom can reliably differentiate between baseline models and intentionally misaligned variants. This level of accuracy makes it useful for both research-focused alignment studies and applied AI safety work.
Bloom is designed to work alongside Petri, another open-source evaluation tool released earlier by Anthropic. While Petri analyses broad behavioural patterns through multi-turn conversations, Bloom focuses on one behaviour at a time. It creates targeted evaluation sets that measure how a single behaviour shows up across different contexts. This makes it especially effective for identifying narrow but high-risk issues that broader tests may overlook.
Anthropic developed Bloom to overcome the limits of traditional alignment evaluations, which often take weeks to design and can become outdated once models learn the test patterns. With Bloom, evaluations that once required weeks can now be created and completed in a few days, helping researchers move at the same pace as model development.
Bloom runs through four automated stages: understanding the target behaviour, creating relevant scenarios, running evaluations at scale, and judging model responses. The framework also integrates with experimentation platforms such as Weights & Biases, allowing repeatable testing and large-scale analysis. Test results show that Bloom's judge models closely match human evaluations and consistently separate misaligned variants from production systems.
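To make those stages concrete, here is a hedged Python sketch of the pipeline. The four stage names mirror the description above, but every function body is a stub, not Bloom's implementation; only the `wandb.init` and `wandb.log` calls use the real Weights & Biases client.

```python
import wandb  # real Weights & Biases client: pip install wandb

def understand_behaviour(description: str) -> dict:
    # Stage 1: expand the behaviour description into a scoring rubric.
    return {"behaviour": description, "rubric": "1 = absent ... 5 = blatant"}

def create_scenarios(plan: dict, n: int) -> list[str]:
    # Stage 2: draft n scenarios targeting the behaviour (stubbed; Bloom
    # would generate these with a model, guided by the rubric).
    return [f"scenario {i} for {plan['behaviour']}" for i in range(n)]

def run_evaluations(scenarios: list[str], target_model) -> list[str]:
    # Stage 3: query the model under test on every scenario.
    return [target_model(s) for s in scenarios]

def judge_responses(responses: list[str]) -> list[float]:
    # Stage 4: score each transcript with a judge model (stubbed as 0.0).
    return [0.0 for _ in responses]

# Offline mode avoids needing a W&B account for this demo.
run = wandb.init(project="bloom-style-eval", mode="offline")
plan = understand_behaviour("long-horizon sabotage")
scenarios = create_scenarios(plan, n=50)
responses = run_evaluations(scenarios, target_model=lambda s: f"reply to {s}")
scores = judge_responses(responses)
wandb.log({"mean_score": sum(scores) / len(scores)})
run.finish()
```

Logging each run to an experiment tracker is what makes the testing repeatable: the same behaviour specification can be re-run against new model versions and the scores compared across runs.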
Early users are applying Bloom to study risks such as jailbreak susceptibility, self-preferential bias, and long-horizon sabotage. These results indicate that Bloom could become a core tool for scalable AI alignment research as systems continue to grow in complexity.