GitHub Repository

Orchid - Orchestration interactive debugger - Record, inspect, & replay AI agents

1 starsPython

Orchid – Local-first record and replay for AI agent debugging

Name: Orchid – Local-first record and replay for AI agent debugging
Availability: InStock
Author: brightmonkey

by brightmonkey·Jun 24, 2026·3 points·0 comments

Visit Project View on HN

AI Analysis

●●●BangerBig BrainSolve My Problem

Deterministic replay of agent runs without mocking—that's genuinely new.

Strengths

•Zero-instrumentation proxy captures all traffic without code changes to existing agents
•Replay feature enables deterministic testing without expensive API calls or mocking
•MCP server integration lets coding agents debug your AI app directly from IDE

Weaknesses

•Redaction only works on field names, not prompt contents—secrets in prompts still stored
•Replay requires all network traffic through proxy, partial recording won't replay correctly

Post Description

Orchid (Orchestration interactive debugger) is a zero-instrumentation proxy that captures every API & LLM call in your agent pipeline, then lets you inspect and replay the entire run locally, step by step. No instrumentation, no vendor lock-in, no cloud dependency. It also provides a visual inspector and MCP server, so you can inspect the session yourself or use your favorite agentic coding IDE to debug your agent runs.

I built it because I was tired of debugging agent failures by grepping through logs, and the available AI observability tools all seemed to require intrusive instrumentation and/or sending my prompts and responses to a cloud service. I wanted something that would let me debug agent runs locally, without having to worry about vendor lock-in or data privacy.

Orchid is that tool. The call inspection features work extremely well, at least for my use cases, but the replay feature is perhaps more interesting. It makes LLM pipeline testing deterministic without mocking or re-running expensive API calls.

Free, self-hosted, runs on your machine or infrastructure: https://github.com/mario-guerra/orchid-trace

Would love feedback from anyone building multi-step agentic systems or struggling with non-deterministic LLM test failures.

Similar Projects

AI/ML●●●Banger

Time Machine – Debug AI Agents by Forking and Replaying from Any Step

Fork from step 8 and replay downstream — saves money when agents fail at step 9.

Solve My ProblemZero to One

deva00

213mo ago

Developer Tools●●●Banger

mcp-recorder – VCR.py for MCP servers. Record, replay, verify

Catches silent MCP breakage VCR.py never could—schema drift detection.

Solve My ProblemBig BrainShip It

caballeto

603mo ago

Developer Tools●●Solid

GhostTrace – See rejected decisions in AI agents

Recording what an agent considered — not just what it executed — is a tidy, concrete insight. GhostTrace already gives record/replay commands, a .ghost.json schema and a --show-phantoms terminal replay so you can inspect rejected actions and the agent's reasoning. The thing that will decide if this takes off is integrations (LangChain/OpenAI Agents/CrewAI) and the promised web/VS Code UIs; without those it's a very useful niche tool, not yet a platform.

Niche GemShip It

AhmedAllam0

114mo ago

Infrastructure●●●Banger

Air Blackbox – Open-source flight recorder for AI agents

Flight recorder for AI agents: record, replay, enforce policies on every LLM call.

WizardrySolve My ProblemZero to One

shotwellj

104mo ago

AI/ML●●Solid

EPI – Cryptographically verifiable execution artifacts for AI agents

Turns an agent run into a verifiable .epi bundle you can hand to auditors or replay locally for debugging. Concrete engineering choices stand out — crash-safe SQLite WAL storage, Ed25519 sealing, and an embedded viewer — though wider integrations (Kubernetes/CICD hooks, verifier tooling) and stronger ecosystem docs will be needed for real adoption.

Niche GemWizardry

afridi_epilabs

104mo ago

Developer Tools●●●Banger

Retrace – reverse debugging for production CPython applications

Record production Python bugs and step backwards from crash to cause in VS Code.

Zero to OneWizardry

L15p3r

1441mo ago