Business in a Box – ~one-shot a typical startup
Two-command pipeline: checklist discovery → iterative fixes until production-ready; survives context flushes.

120+ built-in test playbooks with JSON output agents can read and fix.
Backend engineers, API developers, QA engineers
Schemathesis · Postman · Dredd
Run `dochia init-skills` and your coding agent(s) can trigger tests as they build:
1. Agent writes the endpoint and the OpenAPI spec (or the spec gets generated from code)
2. Agent runs: `dochia test -c api.yml -s localhost:3000`
3. Dochia produces `dochia-summary-report.json` plus per-endpoint test files
4. Agent reads the errors, fixes the code, and re-runs
5. Loop
The JSON output is structured so that agents can read and act on it directly, not only humans parsing logs. It's a native binary (GraalVM), so it's fast on all platforms.
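For anyone wiring this into their own harness, here is a minimal Python sketch of the run, read, fix, re-run loop described above. Only the command line and the report file name come from this post. The report location, the `is_clean` and `apply_fix` callbacks, and the round limit are hypothetical placeholders, and nothing here assumes a particular report schema.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Callable, Optional

# Command and report file name are quoted from the post.
DOCHIA_CMD = ["dochia", "test", "-c", "api.yml", "-s", "localhost:3000"]
REPORT_NAME = "dochia-summary-report.json"


def run_dochia(workdir: Path) -> None:
    """Run the CLI in workdir; the exit code is ignored and the report is read instead."""
    subprocess.run(DOCHIA_CMD, cwd=workdir, check=False)


def load_report(report_path: Path) -> Optional[Any]:
    """Return the parsed report, or None if no report file was written."""
    if not report_path.exists():
        return None
    return json.loads(report_path.read_text(encoding="utf-8"))


def fix_loop(
    workdir: Path,
    report_path: Path,                      # assumption: caller knows where the report lands
    is_clean: Callable[[Any], bool],        # hypothetical: decides whether the report is passing
    apply_fix: Callable[[Any], None],       # hypothetical: the agent's "read errors, fix code" step
    run_tests: Callable[[Path], None] = run_dochia,
    max_rounds: int = 5,                    # hypothetical cap so a stuck loop stops
) -> bool:
    """Re-run the tests until the report is clean or max_rounds is reached."""
    for _ in range(max_rounds):
        run_tests(workdir)
        report = load_report(report_path)
        if report is not None and is_clean(report):
            return True
        apply_fix(report)
    return False
```

Passing `run_tests` as a parameter keeps the loop testable without a running server, and the round cap is there so an agent that cannot make progress stops instead of looping forever.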
Would love feedback on: whether you'd integrate this into your workflow, which test playbooks are missing, whether the report format is actually useful in agentic loops, and any edge cases you'd expect a tool like this to catch.
GitHub: https://github.com/dochia-dev/dochia-cli
Docs: https://docs.dochia.dev
For background, Dochia takes your OpenAPI spec and runs 120+ test playbooks: deterministic negative and boundary scenarios plus chaos testing. No test cases to write, no configuration beyond pointing it at your OpenAPI spec and a running server.
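To make "boundary scenarios" concrete, here is a small illustrative Python sketch that derives string payloads around the `minLength` and `maxLength` keywords of an OpenAPI string schema. It is not Dochia's implementation or one of its playbooks. The function name and case labels are made up for this example.

```python
from typing import Dict


def boundary_strings(schema: dict, fill: str = "a") -> Dict[str, str]:
    """Build payloads just inside and just outside a string schema's length limits."""
    cases: Dict[str, str] = {}
    min_len = schema.get("minLength")
    max_len = schema.get("maxLength")
    if min_len is not None:
        cases["at_min"] = fill * min_len              # should be accepted
        if min_len > 0:
            cases["below_min"] = fill * (min_len - 1)  # should be rejected
    if max_len is not None:
        cases["at_max"] = fill * max_len              # should be accepted
        cases["above_max"] = fill * (max_len + 1)      # should be rejected
    return cases


if __name__ == "__main__":
    # Hypothetical field definition, as it might appear under a spec's properties.
    username = {"type": "string", "minLength": 3, "maxLength": 5}
    for name, value in boundary_strings(username).items():
        print(name, len(value))
```

A server that follows its own spec would typically accept the `at_*` payloads and answer the `below_min` and `above_max` ones with a 4xx response. Cases like these are tedious to write by hand for every field.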
Automated code review loop via agent ping-pong, but Cursor already does multi-turn fixing in context.
Sentry-to-PR pipeline writes failing tests first, then fixes the bug.
Persona-driven critique loop is clever, but locked to pi.dev limits adoption.
Agent testing platform, but screenshot only shows login page—no actual product demo or proof.
Error registry catches stuck agent loops before they waste hours of compute.