RunbookAI – Hypothesis-driven incident investigation agent (open source)
Hypothesis-pruning incident agent with approval gates beats chaos engineering explorers.
Hypothesis-driven AI agent for incident investigation. AWS, K8s, PagerDuty.
The project turns on-call triage into an agent workflow: it forms and prunes hypotheses, fetches evidence from CloudWatch, Kubernetes, and your runbooks, and surfaces an investigation along with approval-gated remediation steps. I like the npx demo, the read-only-by-default K8s stance, and the built-in audit trail; the obvious caveats are its dependence on proprietary LLM keys and the ops work needed before trusting any mutating actions in production.
SREs, on-call engineers, platform teams and DevOps engineers
You get paged at 3 a.m., open six dashboards, grep through logs, check recent deploys, and try to piece together what broke. Most of this is pattern matching that follows a predictable decision tree, which is exactly what an AI agent should be doing.
RunbookAI is an open-source project that understands your infrastructure, ingests your runbooks, and investigates issues, so an investigation is ready for you to review when you get paged at 3 a.m.
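To make the hypothesis-forming and pruning idea concrete, here is a minimal sketch of such a loop with an approval gate in front of remediation. It is an illustration only, not RunbookAI's actual code or API: the names `Hypothesis`, `Investigation`, `investigate`, and `remediate` are hypothetical, and the evidence is a plain dictionary of made-up sample values standing in for whatever would be fetched from CloudWatch or Kubernetes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout; this is NOT RunbookAI's API.


@dataclass
class Hypothesis:
    name: str
    # Returns True if the evidence supports the hypothesis, False otherwise.
    check: Callable[[Dict[str, float]], bool]
    # Human-readable remediation step proposed if the hypothesis survives.
    remediation: str


@dataclass
class Investigation:
    surviving: List[Hypothesis] = field(default_factory=list)
    pruned: List[str] = field(default_factory=list)
    audit_log: List[str] = field(default_factory=list)


def investigate(hypotheses: List[Hypothesis],
                evidence: Dict[str, float]) -> Investigation:
    """Test each hypothesis against the evidence and prune the ones it contradicts."""
    result = Investigation()
    for h in hypotheses:
        if h.check(evidence):
            result.surviving.append(h)
            result.audit_log.append(f"kept: {h.name}")
        else:
            result.pruned.append(h.name)
            result.audit_log.append(f"pruned: {h.name}")
    return result


def remediate(h: Hypothesis, approved: bool, audit_log: List[str]) -> bool:
    """Approval gate: a remediation is only marked runnable after explicit approval."""
    if not approved:
        audit_log.append(f"blocked (no approval): {h.remediation}")
        return False
    audit_log.append(f"approved: {h.remediation}")
    return True


if __name__ == "__main__":
    # Made-up sample evidence, standing in for real metrics.
    evidence = {"error_rate": 0.2, "minutes_since_deploy": 5, "cpu": 0.3}
    hypotheses = [
        Hypothesis(
            "bad deploy",
            lambda e: e["minutes_since_deploy"] < 30 and e["error_rate"] > 0.05,
            "roll back latest deploy",
        ),
        Hypothesis("cpu saturation", lambda e: e["cpu"] > 0.9, "scale out"),
    ]
    report = investigate(hypotheses, evidence)
    for h in report.surviving:
        remediate(h, approved=False, audit_log=report.audit_log)
    print("\n".join(report.audit_log))
```

In this sketch every decision, including a blocked remediation, is appended to the audit log, so whoever picks up the page can see which hypotheses were dropped and why nothing mutating ran without sign-off.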
Finally lets agents use git add -p and vim by wrapping interactive PTY sessions.
Eight specialized AI agents cleaning Metabase messes faster than manual audits.
No-SDK agent tracking: paste a prompt, get a live public status URL.
Autonomous AI agent investigates infra anomalies within a saturated observability market.
Retina-aware screenshot + deterministic coordinate mapping for agent desktop control.