Agent Red Team – Adversarial testing for AI agents before production
Tests agent actions and tool calls, not just output, with deterministic code validation.
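As a rough illustration of what "deterministic code validation" of tool calls could look like, here is a minimal sketch in Python. It is not taken from the product: `ToolCall`, `validate_trace`, the tool names, and the forbidden substring are all hypothetical. The point is only that a recorded trace of actions can be checked with plain code rather than by asking another model to judge the output.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    # Hypothetical record of one tool invocation made by an agent.
    name: str
    args: dict = field(default_factory=dict)


def validate_trace(trace, allowed_tools, forbidden_arg_substrings):
    """Return a list of violations found in a recorded list of ToolCall objects."""
    violations = []
    for i, call in enumerate(trace):
        # Rule 1: the agent may only call tools on the allowlist.
        if call.name not in allowed_tools:
            violations.append(f"step {i}: tool '{call.name}' is not allowed")
        # Rule 2: no argument may contain a forbidden substring.
        for value in call.args.values():
            for bad in forbidden_arg_substrings:
                if bad in str(value):
                    violations.append(f"step {i}: argument contains '{bad}'")
    return violations


trace = [
    ToolCall("search_docs", {"query": "refund policy"}),
    ToolCall("run_shell", {"cmd": "cat /etc/passwd"}),
]
violations = validate_trace(trace, {"search_docs", "send_email"}, ["/etc/passwd"])
print(violations)
```

Because the checks are ordinary code, the same trace always produces the same verdict, which is what makes this kind of test usable as a release gate.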

Adversarial AI agent turns SEC rules into automated compliance tests.
Regtech and fintech companies, compliance teams at financial services firms
Vanta · Drata · Secureframe
For example, we work in regtech, where bugs aren’t always technical. What we often see are things like insider trading alerts that should have fired but didn’t. We wanted an agent that turns laws and regulations into tests.
For now, users can upload PDF, MD, TXT, and DOCX files, and we’re planning integrations with tools like Slack, Notion, Linear, and Zoom.
We’re early on, so we would love to know what you all think!
Auto-patching LLM calls to inject faults and log telemetry is a neat technical trick that lets you fuzz real agent runs without changing your stack. The repo ships six intentionally vulnerable example agents and a CLI (discover/run/ci) with eval packs for security and resilience, so you can reproduce attacks and gate releases. It feels like an early, practical toolkit that fills a gap in agent security testing. Adoption and more community playbooks will determine how far it goes.
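To make the patching idea concrete, here is a minimal sketch of wrapping an LLM client method at runtime so that calls are logged and some of them fail on purpose. This is an assumption about the general technique, not the repo's actual code: `FakeLLMClient`, `InjectedFault`, `patch_with_faults`, and the `telemetry` list are hypothetical names standing in for a real SDK and a real telemetry sink.

```python
import functools
import random
import time


class FakeLLMClient:
    """Hypothetical stand-in for a real LLM SDK client."""

    def complete(self, prompt):
        return f"echo: {prompt}"


class InjectedFault(RuntimeError):
    """Raised by the wrapper to simulate a provider failure."""


telemetry = []  # hypothetical in-memory telemetry sink


def patch_with_faults(cls, method_name, fault_rate, rng):
    """Replace cls.method_name with a wrapper that logs calls and sometimes raises."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        record = {"method": method_name, "args": args, "fault": False}
        telemetry.append(record)
        if rng.random() < fault_rate:
            record["fault"] = True
            raise InjectedFault("simulated provider error")
        result = original(self, *args, **kwargs)
        record["seconds"] = time.perf_counter() - start
        return result

    setattr(cls, method_name, wrapper)
    return original  # returned so the caller can restore it later


original = patch_with_faults(FakeLLMClient, "complete", fault_rate=1.0, rng=random.Random(0))
client = FakeLLMClient()
try:
    client.complete("hello")
except InjectedFault:
    print("agent saw an injected fault")

FakeLLMClient.complete = original  # undo the patch
print(client.complete("hello"))
```

The agent code itself is untouched, which is the appeal of the approach: the fault injection lives entirely in the patch and can be removed by restoring the original method.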
Replay production queries against shadow DB to catch 92x regressions before they ship.
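A minimal sketch of the replay idea, assuming the regressions in question are query-performance regressions. It uses in-memory SQLite and compares query plans rather than wall-clock time so the result is repeatable. The schema, the `query_log`, and the helper names are hypothetical, and a real setup would replay captured production traffic against a copy of the real database.

```python
import sqlite3

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"


def make_db(with_index):
    """Build an in-memory database, optionally with the customer_id index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    if with_index:
        conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    return conn


def full_scans(conn, sql, params):
    """Return the query-plan steps that scan a whole table."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row is the human-readable detail string.
    return [row[3] for row in rows if row[3].startswith("SCAN")]


def replay(query_log, baseline, shadow):
    """Flag queries that scan in the shadow database but not in the baseline."""
    regressions = []
    for sql, params in query_log:
        if full_scans(shadow, sql, params) and not full_scans(baseline, sql, params):
            regressions.append(sql)
    return regressions


query_log = [
    ("SELECT * FROM orders WHERE customer_id = ?", (42,)),
    ("SELECT * FROM orders WHERE id = ?", (7,)),
]
baseline = make_db(with_index=True)
shadow = make_db(with_index=False)  # simulates a migration that dropped the index
regressions = replay(query_log, baseline, shadow)
print(regressions)
```

Comparing plans is only a proxy for latency. A production version would likely also time each query and compare against a threshold.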
Passing self-run Jepsen tests is a strong signal, even without Kyle's stamp.
Agent testing platform, but screenshot only shows login page—no actual product demo or proof.
VCR cassettes for agent tool sequences—catches prompt regressions before deploy.
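The cassette idea can be sketched in a few lines: record the ordered tool calls from a known-good run, then compare later runs against that recording. This is an illustrative assumption, not the product's implementation. `run_agent`, `changed_agent`, and `check_against_cassette` are hypothetical, and the "agent" here is a plain function so the example stays self-contained.

```python
import json
import os
import tempfile


def run_agent(prompt):
    """Hypothetical agent: returns the ordered tool calls it would make."""
    calls = [{"tool": "lookup_customer", "args": {"query": prompt}}]
    if "refund" in prompt:
        calls.append({"tool": "issue_refund", "args": {"amount": 20}})
    return calls


def check_against_cassette(path, prompt, agent):
    """Record on first run; afterwards compare the live sequence to the recording."""
    live = agent(prompt)
    if not os.path.exists(path):
        with open(path, "w") as f:
            json.dump(live, f, indent=2)
        return "recorded"
    with open(path) as f:
        recorded = json.load(f)
    return "match" if live == recorded else "mismatch"


def changed_agent(prompt):
    # Simulates a prompt change that makes the agent skip the customer lookup.
    return run_agent(prompt)[1:]


cassette = os.path.join(tempfile.mkdtemp(), "refund_flow.json")
first = check_against_cassette(cassette, "refund order 17", run_agent)
second = check_against_cassette(cassette, "refund order 17", run_agent)
third = check_against_cassette(cassette, "refund order 17", changed_agent)
print(first, second, third)
```

In CI, a "mismatch" would fail the build, and an intentional behavior change would be handled by deleting the cassette and recording a new one.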