agent-replay is a 100% local, SQLite-powered CLI tool for time-travel debugging of AI agents. It lets you replay execution traces, diff behavioral changes, fork runs to test fixes, and run AI-powered evaluations or safety guardrails to catch hallucinations and production failures.

5 stars · TypeScript

Time-travel debugging and side-by-side diffs for AI agents

by hireclay·Feb 28, 2026·1 point·0 comments

AI Analysis

●●● Banger · Solve My Problem · Ship It · Big Brain

Replay, fork, diff, and evaluate agent traces locally, like Git for agent behavior. Fills a real gap.

Strengths
  • Time-travel debugging (replay, fork from any step, change input mid-trace) is genuinely novel for agents
  • SQLite local-first architecture means zero cloud dependency, works offline with full trace history
  • Automatic evaluations (hallucination detection, guardrails, golden datasets) address the blind spot around agent quality
Weaknesses
  • Work-in-progress status; unclear whether the core replay/fork mechanics are fully functional or still MVP-stage promises
  • No comparison to Langfuse, Arize, or Weights & Biases tracing; positioning against observability platforms is fuzzy
Target Audience

AI agent developers, teams running production agentic systems, prompt/model debugging workflows

Similar To

Langfuse · Arize AI · Weights & Biases Traces

Post Description

agent-replay provides time-travel debugging and side-by-side diffs to pinpoint exactly where AI agents hallucinate or fail. It replaces manual log diving with a local-first toolkit to replay, fork, and automatically evaluate agent traces for faster iteration. It's a work-in-progress. I'd love any feedback. Thank you.
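To make the replay, fork, and diff ideas above concrete, here is a minimal TypeScript sketch. It is not agent-replay's actual API or storage schema: the `Step` and `Trace` types, the `record`, `forkTrace`, and `firstDivergence` helpers, and the toy `shout` step function are all hypothetical names invented for illustration. It also assumes each step's output feeds the next step's input and keeps traces in memory rather than in SQLite.

```typescript
// Hypothetical types for illustration only; not agent-replay's real schema.
interface Step {
  index: number;
  input: string;
  output: string;
}
type Trace = Step[];

// Stand-in for one agent step. Deterministic, so replays are reproducible.
type StepFn = (input: string) => string;

// Record a run: each step's output becomes the next step's input (assumption).
function record(firstInput: string, steps: number, run: StepFn): Trace {
  const trace: Trace = [];
  let input = firstInput;
  for (let i = 0; i < steps; i++) {
    const output = run(input);
    trace.push({ index: i, input, output });
    input = output;
  }
  return trace;
}

// Fork: keep the steps before `at`, swap in a new input at `at`,
// then re-run the remaining steps from there.
function forkTrace(trace: Trace, at: number, newInput: string, run: StepFn): Trace {
  const kept = trace.slice(0, at);
  const rerun = record(newInput, trace.length - at, run).map((s) => ({
    ...s,
    index: s.index + at, // shift indices so they line up with the original trace
  }));
  return [...kept, ...rerun];
}

// Diff: index of the first step whose input or output differs, or -1 if none.
function firstDivergence(a: Trace, b: Trace): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i];
    const y = b[i];
    if (!x || !y || x.input !== y.input || x.output !== y.output) return i;
  }
  return a.length === b.length ? -1 : n;
}

// Toy step function standing in for a model or tool call.
const shout: StepFn = (s) => s.toUpperCase() + "!";

const original = record("hi", 3, shout);
const fork = forkTrace(original, 1, "bye", shout);

console.log(firstDivergence(original, fork)); // prints 1
```

The fork shares step 0 with the original run and diverges at step 1, which is the kind of answer a side-by-side diff is meant to surface.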

Similar Projects

AI/ML · ●● Solid

EPI – Cryptographically verifiable execution artifacts for AI agents

Turns an agent run into a verifiable .epi bundle you can hand to auditors or replay locally for debugging. Concrete engineering choices stand out (crash-safe SQLite WAL storage, Ed25519 sealing, and an embedded viewer), though wider integrations (Kubernetes and CI/CD hooks, verifier tooling) and stronger ecosystem docs will be needed for real adoption.

Niche Gem · Wizardry
afridi_epilabs
103mo ago