GitHub Action that diagnoses CI failures with Claude AI
Claude digests CI logs and posts fix steps on PRs—but Anthropic's own GitHub action exists.
SQLite history and Markdown postmortems help, but incident diagnosis via LLM is a crowded category.
SREs and DevOps engineers
Datadog AI · PagerDuty · Incident.io
I've been building Autopsy, an open-source CLI tool to solve a problem I hate: manually digging through CloudWatch logs during a 3 AM page.
It pulls your CloudWatch logs and GitHub deploys, passes them to an LLM, and returns a structured root cause analysis in your terminal in about 30 seconds.
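As a rough illustration of the flow described above, here is a minimal sketch of the step that lines up log events with deploys before anything is sent to an LLM. It assumes the logs and deploys have already been fetched into plain dicts; the function names `build_timeline` and `build_prompt`, the dict fields, and the prompt wording are all hypothetical and are not taken from Autopsy's source.

```python
from datetime import datetime, timezone


def build_timeline(log_events, deploys):
    """Merge log events and deploys into one list sorted by Unix timestamp.

    Both inputs are lists of dicts (hypothetical shape, for illustration only).
    """
    entries = [(e["timestamp"], "LOG", e["message"]) for e in log_events]
    entries += [
        (d["timestamp"], "DEPLOY", f'{d["sha"]} {d["description"]}')
        for d in deploys
    ]
    # Sort on the timestamp only, so entries with equal timestamps keep input order.
    return sorted(entries, key=lambda entry: entry[0])


def build_prompt(timeline):
    """Render the merged timeline as text an LLM could be asked to analyze."""
    lines = [
        f"{datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()} [{kind}] {text}"
        for ts, kind, text in timeline
    ]
    header = (
        "Given the following logs and deploys, identify the most likely "
        "root cause and cite the lines that support it.\n\n"
    )
    return header + "\n".join(lines)
```

Interleaving deploys with log lines in time order is what lets a model (or a human) notice that errors started shortly after a specific change.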
The early version worked, but the diagnosis vanished the moment you closed your terminal. I just shipped v0.2.2 to make it actually fit into an SRE/DevOps workflow:
Local History: Every run is now saved automatically to a local SQLite database. You can search, browse, and export past incidents (e.g. autopsy history list).
Instant Post-Mortems: Passing --postmortem generates a full incident report in Markdown, including a timeline, evidence, and checkbox action items.
Slack Integration: Passing --slack pushes the diagnosis directly to your incident channel via webhook.
You can chain them: autopsy diagnose --postmortem --slack
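For readers curious how the three features above might fit together, the following is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the `runs` table schema, the function names, and the report layout are hypothetical and do not come from Autopsy itself. The only external fact relied on is that Slack incoming webhooks accept a JSON POST body with a `text` field.

```python
import json
import sqlite3
import urllib.request


def save_run(conn, service, root_cause, report_md):
    """Store one diagnosis in a local SQLite table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP, "
        "service TEXT, root_cause TEXT, report_md TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO runs (service, root_cause, report_md) VALUES (?, ?, ?)",
        (service, root_cause, report_md),
    )
    conn.commit()
    return cur.lastrowid


def search_runs(conn, term):
    """Return (id, service, root_cause) for runs whose root cause contains term."""
    rows = conn.execute(
        "SELECT id, service, root_cause FROM runs "
        "WHERE root_cause LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()


def render_postmortem(title, timeline, evidence, action_items):
    """Render a Markdown report with a timeline, evidence, and checkbox items."""
    lines = [f"# Postmortem: {title}", "", "## Timeline"]
    lines += [f"- {when}: {what}" for when, what in timeline]
    lines += ["", "## Evidence"]
    lines += [f"- {item}" for item in evidence]
    lines += ["", "## Action items"]
    lines += [f"- [ ] {item}" for item in action_items]
    return "\n".join(lines) + "\n"


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_slack_request` to `urllib.request.urlopen` would actually deliver the message; keeping construction and sending separate makes the payload easy to inspect before anything leaves the machine.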
Privacy is a big priority here. Autopsy is written in Python and runs locally; the only log data that leaves your machine is the payload you choose to send to the LLM.
GitHub: https://github.com/zeelapatel/autopsy
PyPI: pip install autopsy-cli
I'd love for you to try it out. I'm especially looking for feedback on the LLM's diagnosis accuracy across different types of stack traces and log formats. Happy to answer any questions!
Root-cause correlation for Lambda logs—solves real CloudWatch debugging pain better than manual Insights queries.
Unified ops schema is useful but Backstage already does this with more adoption.
Content-aware rule matching on diffs surfaces only relevant decisions, not noise.
Slack + logs → board-ready incident report with sourced evidence claims.
Ambient audio cues for error rates—leaves app running to hear infrastructure degradation without staring at dashboards.