Mindgard launches GuardBuster to test AI guardrails

Mon, 1st Jun 2026 (Today)

Mindgard has launched GuardBuster, a product designed to test how AI guardrails hold up under real-world attack conditions.

The new offering is aimed at organisations using AI systems, agents, copilots and large language model applications that rely on guardrails to limit prompt injection, jailbreaks and data leakage.

GuardBuster is designed to assess defences against adaptive adversarial behaviour, rather than benchmark prompts used in controlled testing. Mindgard said many existing guardrail assessments are based on narrow scenarios that do not reflect attackers' changing tactics.

The launch comes as companies face growing pressure to show that controls around AI systems work outside laboratory conditions. In practice, security teams must judge whether protections that perform well in tests will still hold when users or attackers manipulate context, fragment instructions or use other evasive methods.

GuardBuster combines Mindgard's platform, research and adversarial AI security work to evaluate both guardrails and AI gateways. The tool examines systems under agentic and realistic attack conditions, including psycho-analytical coercion, subtle prompt injection and jailbreaking, character-level evasion, adversarial machine learning evasion, multi-turn manipulation and contextual obfuscation.

That focus reflects a broader concern in the AI security market over vendor-reported benchmark scores. Buyers often receive accuracy rates or performance figures from suppliers, but those measures may not show how a control behaves when an attacker adapts in real time.

Mindgard argued that this leaves enterprise buyers and builders without enough independent evidence to judge whether a guardrail is effective. It also said guardrails need frequent re-evaluation because the context around AI deployments and the surrounding threat environment can change quickly.

According to Mindgard's research, large language model guardrail systems have significant blind spots against real-world attacks. The company said current prompt injection and jailbreak detection systems can be evaded, and that large language models remain vulnerable to those threats.

Aaron Portnoy, Chief Product Officer at Mindgard, said the issue is not simply whether a safeguard exists, but whether organisations can verify its performance.

"If an organization invests in a guardrail, but cannot measure it effectively, they're facing a gap that still must be addressed," said Aaron Portnoy, Chief Product Officer at Mindgard.

He added that organisations need more detailed evidence on how controls behave under stress.

"The AI ecosystem needs independent validation that shows not just whether a control passes or fails, but what type of attacks it can stop, how systems respond under adversarial pressure, and where defenses begin to break down. With this offering, Mindgard acts as the complement to any guardrail, enabling organizations to validate their security investments with proven value, and empowering customers to push back on vendors who aren't delivering quality assessments," Portnoy said.

Testing gap

The debate over AI guardrails has intensified as more companies build customer-facing and internal tools on top of foundation models. Many of those systems depend on safeguards to block unsafe responses, protect sensitive information and keep automated agents within policy boundaries.

Yet security specialists have warned that testing based on familiar prompts can miss the ways adversaries chain inputs together, disguise intent or exploit surrounding application logic. That means the problem is not limited to the model itself, but can extend to the wider software environment in which it operates.

Mindgard said its offering is intended to help organisations move from vendor claims to independent analysis. It framed continuous testing as necessary not only to confirm security, but also to harden products and reduce exposure in live applications.

Peter Garraghan, Founder and Chief Science Officer at Mindgard and professor of computer science at Lancaster University, said attackers are moving beyond known jailbreak patterns.

"Attackers do not rely on set prompts or familiar jailbreaks frequently found within publicly available datasets, they adapt to other forms of adversarial prompting, such as manipulating context, fragmenting instructions, and translation," said Peter Garraghan, Founder and Chief Science Officer at Mindgard and professor of computer science at Lancaster University.

He said benchmark performance can create a false sense of safety if organisations do not also test for adaptive threats.

"Guardrails that perform well on known benchmarks are failing against adaptive attackers, because security teams need to test continuously as attacks evolve. I built Mindgard to enable organizations to better understand their risk, and this new hybrid assessment tool helps to close the gap between claimed performance and real-world resilience," Garraghan said.

Mindgard was spun out of AI security research at Lancaster University and is based in Boston and London. The company focuses on identifying vulnerabilities in AI models, agents and applications before they are exploited.

The GuardBuster launch places Mindgard in a growing segment of the cybersecurity market centred on AI assurance, where suppliers are trying to provide independent testing as companies expand their use of generative AI across business systems.

ChatGPT

Key takeaways Explain why it matters Create action plan Future watch

Claude

Key takeaways Explain why it matters Create action plan Future watch

Perplexity

Key takeaways Explain why it matters Create action plan Future watch

Grok

Key takeaways Explain why it matters Create action plan Future watch

Share Share

Add us as a preferred source on Google

Image: Aaron Portnoy and Peter Garraghan