ClawSandbox – 7/9 attacks succeeded against an AI agent w/ shell access
First systematic attack framework showing that 7 of 9 exploits succeeded against an AI agent with shell access.

Impressive case-study numbers, but this is marketing content, not a product launch.
Security teams and event organizers running live infrastructure
Cloudflare · AWS Shield · Fastly
Sub-second DDoS mitigation on your servers, but Cloudflare and AWS Shield dominate.
Reimplements dependency functions locally with test verification, challenging the "dependencies are good" mantra.
Agent red-teaming via a UI, but the attack catalog is shallow and it is unclear how it compares with manual testing.
Multimodal evals with file normalization across endpoints — LangSmith doesn't do this.
Replaces manual Playwright scripting, but Claude-generated tests and GitHub Copilot already cover this.