GitHub Repository

Behavioral testing framework for AI agents

1 star · TypeScript

TracePact – Catch tool-call regressions in AI agents before prod

by soydanicg·Mar 8, 2026·1 point·0 comments

AI Analysis

●●● Banger · Solve My Problem · Ship It · Niche Gem

VCR cassettes for agent tool sequences—catches prompt regressions before deploy.

Strengths
  • Solves a real, painful gap: LLM output looks right but tool ordering breaks. No API calls needed for replay/diff.
  • Clear UX: block/warn classification, --fail-on flags, --ignore-keys filtering. Vitest integration for assertions.
  • Provider-agnostic: cassette replay works with any LLM provider. Deterministic testing of tool sequences was previously manual or missing entirely.
Weaknesses
  • Narrow audience: only matters if you're building agentic systems. Ecosystem adoption unknown.
  • Early stage: 1 star, minimal real-world usage signal yet.
Target Audience

AI/LLM engineers, agent builders, ML ops teams

Similar To

VCR.js (HTTP cassettes) · Pytest fixtures for mocking

Post Description

I kept running into the same problem: I'd change a prompt or update my model, the agent output looked fine, but the tool sequence was completely different. It stopped reading config before deploying. It ran npm run build instead of npm test. The bug showed up days later in prod.

So I built TracePact. Record a known-good run as a cassette, diff against new runs, get a clear report of what changed:

- read_file (seq 0) (removed)
~ bash.cmd: "npm test" -> "npm run build"
Summary: 1 removed, 1 arg changed [BLOCK]

It classifies changes as block (structural) or warn (args only), so you can gate CI with --fail-on warn. You can filter noise with --ignore-keys timestamp and --ignore-tools read_file.
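The block/warn split described above can be sketched roughly as follows. This is an illustrative sketch, not TracePact's actual implementation. The `ToolCall` shape, the `diffTraces` and `shouldFail` functions, the LCS-based alignment on tool names, and the severity ranking behind `--fail-on` are all assumptions made for the example.

```typescript
// Hypothetical trace shape; TracePact's real cassette format may differ.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Change =
  | { kind: "removed" | "added"; tool: string; seq: number }
  | { kind: "arg_changed"; tool: string; seq: number; key: string; before: unknown; after: unknown };
type Severity = "none" | "warn" | "block";
type DiffOptions = { ignoreKeys?: string[]; ignoreTools?: string[] };

function diffTraces(
  baseline: ToolCall[],
  current: ToolCall[],
  opts: DiffOptions = {},
): { changes: Change[]; severity: Severity } {
  const ignoreTools = opts.ignoreTools ?? [];
  const ignoreKeys = opts.ignoreKeys ?? [];
  // Keep original sequence numbers so reports refer to recorded positions.
  const a = baseline.map((call, seq) => ({ call, seq })).filter((e) => !ignoreTools.includes(e.call.tool));
  const b = current.map((call, seq) => ({ call, seq })).filter((e) => !ignoreTools.includes(e.call.tool));

  // lcs[i][j] = length of the longest common subsequence of tool names in a[i..] and b[j..].
  const lcs: number[][] = [];
  for (let i = 0; i <= a.length; i++) lcs.push(new Array<number>(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i].call.tool === b[j].call.tool
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const changes: Change[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    const before = a[i];
    const after = b[j];
    if (before.call.tool === after.call.tool) {
      // Same tool at aligned positions: compare arguments key by key.
      const keys = Object.keys({ ...before.call.args, ...after.call.args });
      for (const key of keys) {
        if (ignoreKeys.includes(key)) continue;
        const x = before.call.args[key];
        const y = after.call.args[key];
        if (JSON.stringify(x) !== JSON.stringify(y)) {
          changes.push({ kind: "arg_changed", tool: before.call.tool, seq: before.seq, key, before: x, after: y });
        }
      }
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      changes.push({ kind: "removed", tool: before.call.tool, seq: before.seq });
      i++;
    } else {
      changes.push({ kind: "added", tool: after.call.tool, seq: after.seq });
      j++;
    }
  }
  for (; i < a.length; i++) changes.push({ kind: "removed", tool: a[i].call.tool, seq: a[i].seq });
  for (; j < b.length; j++) changes.push({ kind: "added", tool: b[j].call.tool, seq: b[j].seq });

  // Structural changes (added/removed calls) block; argument-only changes warn.
  const structural = changes.some((c) => c.kind !== "arg_changed");
  const severity: Severity = structural ? "block" : changes.length > 0 ? "warn" : "none";
  return { changes, severity };
}

// Assumed semantics for a --fail-on style gate: fail at or above the given level.
function shouldFail(severity: Severity, failOn: "warn" | "block"): boolean {
  const rank = { none: 0, warn: 1, block: 2 };
  return rank[severity] >= rank[failOn];
}

// The regression from the post: the config read disappears and the command changes.
const baseline: ToolCall[] = [
  { tool: "read_file", args: { path: "config.json" } },
  { tool: "bash", args: { cmd: "npm test" } },
];
const current: ToolCall[] = [{ tool: "bash", args: { cmd: "npm run build" } }];

const report = diffTraces(baseline, current);
console.log(report.severity, report.changes.length);
```

In this sketch, ignoring `read_file` leaves only the argument change, so the result drops from block to warn. That is why a gate at the warn level is stricter than a gate at the block level.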

No API calls needed for replay/diff. Works with any LLM provider. Vitest integration for writing assertions on tool traces.

https://github.com/dcdeve/tracepact

Similar Projects

Security · ●●● Banger

OpenClaw skills degrade agent safety

Behavioral safety testing reveals 45 regressions static analysis misses—guardrails provided.

Big Brain · Wizardry · Zero to One
shadab_nazar
123mo ago
AI/ML · ●●● Banger

ToolGuard – Pytest for AI agent tool calls

Finally, pytest for AI tool calls when evals only test intelligence.

Solve My Problem · Zero to One
Heer_J
122mo ago