GitHub Repository

Hypothesis-driven AI agent for incident investigation. AWS, K8s, PagerDuty.

9 stars · TypeScript

RunbookAI – Stop scrolling dashboards at 3 a.m., let AI investigate

by EmTekker·Feb 16, 2026·1 point·0 comments

AI Analysis

●● Solid · Solve My Problem · Niche Gem · Wizardry
The Take

The project turns on-call triage into a hypothesis-driven agent workflow: it forms and prunes hypotheses, fetches evidence from CloudWatch, Kubernetes, and your runbooks, and surfaces an investigation plus approval-gated remediation steps. I like the npx demo, the read-only-by-default K8s stance, and the built-in audit trail; the obvious caveats are its dependence on API keys for proprietary LLMs and the ops work needed before trusting any mutating actions in production.
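To make the "form and prune hypotheses" and "approval-gated remediation" ideas concrete, here is a minimal sketch of how such a loop could look. It is not RunbookAI's actual code or API: every type and function name below (`Hypothesis`, `Evidence`, `investigate`, `runStep`), the confidence update rule, and the pruning threshold are assumptions made up for illustration.

```typescript
// Hypothetical types for illustration; none of these names come from RunbookAI.
interface Hypothesis {
  id: string;
  statement: string;
  confidence: number; // assumed scale: 0 (ruled out) to 1 (certain)
}

interface Evidence {
  source: string;    // e.g. "cloudwatch", "kubernetes", "runbook"
  supports: boolean; // whether this evidence supports the hypothesis
}

// Stand-in for a read-only evidence lookup (logs, metrics, recent deploys).
type EvidenceFetcher = (h: Hypothesis) => Evidence[];

// Update each hypothesis from its evidence, drop the weak ones,
// and return the survivors ordered from most to least likely.
function investigate(
  hypotheses: Hypothesis[],
  fetchEvidence: EvidenceFetcher,
  pruneBelow = 0.2, // assumed threshold
): Hypothesis[] {
  const surviving: Hypothesis[] = [];
  for (const h of hypotheses) {
    let confidence = h.confidence;
    for (const e of fetchEvidence(h)) {
      // Deliberately simple update rule: nudge up or down per evidence item.
      confidence = e.supports
        ? Math.min(1, confidence + 0.2)
        : Math.max(0, confidence - 0.3);
    }
    if (confidence >= pruneBelow) {
      surviving.push({ ...h, confidence });
    }
  }
  return surviving.sort((a, b) => b.confidence - a.confidence);
}

interface RemediationStep {
  description: string;
  mutating: boolean; // true if the step changes infrastructure state
}

// Approval gate: mutating steps are refused unless explicitly approved.
function runStep(step: RemediationStep, approved: boolean): string {
  if (step.mutating && !approved) {
    return `BLOCKED (needs approval): ${step.description}`;
  }
  return `RAN: ${step.description}`;
}
```

In this sketch, read-only steps run freely while anything mutating waits for a human, which mirrors the read-only-by-default stance described above. A real agent would replace the fixed increments with model-driven scoring.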

Target Audience

SREs, on-call engineers, platform teams and DevOps engineers

Post Description

Hey HN, I built this project from personal pain while being on-call.

You get paged at 3 a.m., open six dashboards, grep through logs, check recent deploys, and try to piece together what broke. Most of this is pattern matching that follows a predictable decision tree, which is exactly what an AI agent should be doing.

RunbookAI is an open-source project that understands your infra, ingests your runbooks, and investigates issues, so an investigation is ready for you to review when you get paged at 3 a.m.

Similar Projects