AI vs. AI – code and reviews only count if they survive an attack
Adversarial multi-agent verification when best-of-N sampling is already well-documented elsewhere.
Two Claude Code skills that run a coding task through a multi-agent harness: plan → N parallel implementations → adversarial verification → judge. pantheon is Claude-only; pantheon-x adds cross-model verification with GPT-5.5. MIT-licensed.
Adversarial reviewers try to break builds before self-written tests rubber-stamp them.
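The plan → N implementations → adversarial verification → judge flow can be sketched roughly as below. This is a minimal illustration and not the skills' actual code: every name (`make_plan`, `implement`, `attack`, `judge`, `run`) is hypothetical, and the "agents" are plain stub functions standing in for model calls. The toy task is an absolute-value function, chosen so that a weak self-written test passes every candidate while the adversary breaks two of the three.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Candidate:
    name: str
    impl: Callable[[int], int]
    failures: List[int] = field(default_factory=list)  # inputs that broke it


def make_plan(task: str) -> str:
    # Hypothetical planner: a real one would be a model call.
    return f"Implement {task}; handle negative, zero, and positive inputs."


def implement(plan: str, seed: int) -> Candidate:
    # Hypothetical implementer: three stand-in attempts at abs(x).
    variants = [
        lambda x: x,                   # wrong for negatives
        lambda x: -x if x < 0 else x,  # correct
        lambda x: max(x, 0),           # wrong for negatives
    ]
    return Candidate(name=f"impl-{seed}", impl=variants[seed % len(variants)])


def self_test(candidate: Candidate) -> bool:
    # The kind of happy-path test an implementer might write for itself.
    return candidate.impl(3) == 3


def attack(candidate: Candidate) -> Candidate:
    # Adversarial verifier: probe edge cases and record every break.
    for x in (-5, -1, 0, 7):
        y = candidate.impl(x)
        if y < 0 or y not in (x, -x):
            candidate.failures.append(x)
    return candidate


def judge(candidates: List[Candidate]) -> Optional[Candidate]:
    # Only candidates that survived the attack are eligible.
    survivors = [c for c in candidates if not c.failures]
    return survivors[0] if survivors else None


def run(task: str, n: int = 3) -> Optional[Candidate]:
    plan = make_plan(task)
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda s: implement(plan, s), range(n)))
        attacked = list(pool.map(attack, candidates))
    return judge(attacked)


if __name__ == "__main__":
    winner = run("abs(x)")
    print(winner.name if winner else "no candidate survived")
```

In this sketch all three candidates pass `self_test`, but only the one with no recorded failures reaches the judge, which is the point of attacking builds before trusting their own tests.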
Developers using Claude Code for complex coding tasks
Cursor · Continue · LangGraph
Useful Claude Code skill, but it's config for an existing tool, not a product.
Rust-style error diagnostics for AI feedback when nothing else formats like this.
RSS aggregator with vote filtering wrapped as a Claude Code skill.
Adversarial verification catches bugs self-written tests miss — genuine architectural novelty.
Multi-wave code review with 20+ specialists reading each other's findings before final analysis.