Pentesting Tool Using Claude
Autonomous hunting with Burp MCP integration beats manual recon workflows.
Autonomous QA bug-hunt harness: an agent walks the full app flow, finds bugs, fixes them, and ships each as a PR with inline before/after GIFs. Target app included as a submodule.
Before/after GIFs embedded in PRs let non-technical reviewers see bugs fixed.
Developers using Claude Code for QA automation
Sentry · Bugsnag · Playwright
I had a few problems:
1. Sometimes the reported bugs are not actually bugs, just stylistic choices. Fable said it was a bug that my login didn't show a length-check error message. I've never seen a length checker on a login.
2. Testing bugs locally takes a long time.
3. Testing outside of GitHub while reviewing code in PRs is a disjointed experience.
Gifhub is a simple Claude plugin that solves these problems. It attaches GIFs of the reproduced bug and of the fix to PRs on GitHub.
There's a GIF in the README.md to show you what this looks like. Showing is better than telling!
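For readers who want a concrete picture of the before/after idea, here is a minimal sketch of how a PR description with two embedded GIFs could be assembled. This is not Gifhub's actual code: the function name `build_pr_body`, the table layout, and the example.com URLs are all assumptions made for illustration.

```python
def build_pr_body(summary: str, before_gif_url: str, after_gif_url: str) -> str:
    """Return a Markdown PR description with before/after GIFs side by side.

    Hypothetical helper: the layout is an assumption, not Gifhub's real format.
    """
    return "\n".join([
        "## Bug fix",
        "",
        summary,
        "",
        "| Before (bug reproduced) | After (fix applied) |",
        "| --- | --- |",
        f"| ![before]({before_gif_url}) | ![after]({after_gif_url}) |",
    ])


# Placeholder values; a real run would use the URLs of the recorded GIFs.
body = build_pr_body(
    summary="Submit button stayed disabled after a valid email was entered.",
    before_gif_url="https://example.com/before.gif",
    after_gif_url="https://example.com/after.gif",
)
print(body)
```

GitHub renders Markdown image links inline in PR descriptions, so a reviewer sees both recordings without checking out the branch.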
Specific enterprise attack matrices for Entra and Okta beat generic OWASP Top 10 prompts.
AI agent actually fixes bugs in real VMs, not just prompting. Firecracker isolation + verified PRs.
Self-dogfooding via 24/7 agent tasks on Kelos itself; solid but unproven at scale.
Execution-based scoring with live APIs beats LLM-graded benchmarks, but they evaluated themselves.
Pre-collected bug bounty recon data so you skip the scanning phase.