Tuesday, November 11, 2025

Lakera launches open-source benchmark to strengthen LLM security in AI agents

Check Point Software Technologies Ltd., Lakera, and researchers from the UK AI Security Institute have jointly developed the Backbone Breaker Benchmark (b3), an open-source framework created to evaluate the security of large language models (LLMs) used in AI agents.

The b3 benchmark focuses on “backbone” LLMs, which serve as the core of AI agents, and identifies their most vulnerable areas without requiring tests of the full AI workflow. It introduces a method called “threat snapshots”: micro-tests that capture the model’s response at specific points during an attack.

The benchmark combines ten representative threat snapshots with a high-quality dataset of over 19,000 adversarial attacks. The attacks were collected through Gandalf: Agent Breaker, a gamified red teaming platform and security simulation game developed by Lakera. The tests assess LLM vulnerability to different attack types, including system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service, and unauthorised tool usage.

b3 generates measurable and comparable vulnerability scores across models, making it easier to assess LLM security under real-world adversarial conditions. Early findings suggest that models enhanced with reasoning capabilities tend to be more secure. They also indicate that model size does not necessarily determine security and that open-weight models are rapidly approaching the security standards of closed-source systems.
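To make the idea of a comparable per-model score concrete, here is a minimal illustrative sketch in Python. It is not b3's actual scoring method, which the article does not describe. It assumes a hypothetical record type, `AttackResult`, and a hypothetical rule: a model's score is the attack success rate within each threat snapshot, averaged across snapshots. The model and snapshot names are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackResult:
    """One adversarial attempt against one model (hypothetical record type)."""
    model: str       # placeholder model name
    snapshot: str    # placeholder threat snapshot name
    succeeded: bool  # True if the attack achieved its goal


def vulnerability_scores(results):
    """Return {model: mean of per-snapshot attack success rates}.

    Assumed scoring rule for illustration only; higher means more vulnerable.
    """
    # (model, snapshot) -> [successes, attempts]
    counts = defaultdict(lambda: [0, 0])
    for r in results:
        entry = counts[(r.model, r.snapshot)]
        entry[0] += int(r.succeeded)
        entry[1] += 1

    # Collect one success rate per snapshot for each model.
    rates = defaultdict(list)
    for (model, _snapshot), (successes, attempts) in counts.items():
        rates[model].append(successes / attempts)

    # Average across snapshots so each snapshot carries equal weight.
    return {model: sum(v) / len(v) for model, v in rates.items()}


if __name__ == "__main__":
    demo = [
        AttackResult("model-a", "prompt-exfiltration", True),
        AttackResult("model-a", "prompt-exfiltration", False),
        AttackResult("model-a", "phishing-link", False),
        AttackResult("model-b", "prompt-exfiltration", True),
    ]
    print(vulnerability_scores(demo))
```

Averaging per snapshot rather than over all attacks keeps a snapshot with many recorded attacks from dominating the score; a real benchmark may weight things differently.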

This benchmark provides AI developers and model providers with a practical approach to identifying and fixing security weaknesses in backbone LLMs. The complete benchmark and related research are publicly available under an open-source licence, with data derived from Gandalf: Agent Breaker, one of the world’s largest AI security red teaming communities.

The Gandalf: Agent Breaker game challenges players to exploit AI agents in simulated environments, featuring ten AI applications with multiple difficulty levels and varied defences. The idea originated from a Lakera hackathon, where competing teams tested defences against an LLM protecting a secret password. Since 2023, it has evolved into a global AI red teaming community, generating over 80 million data points.

Players use techniques such as prompt injection, memory tampering, and tool manipulation to uncover vulnerabilities, helping professionals gain practical GenAI security experience. The game is aimed at AI engineers, educators, product teams, and cybersecurity professionals seeking to enhance their understanding of AI agent safety in a dynamic learning environment.
