Edictum – Runtime governance for LLM agent tool calls

by acartag7 · Feb 25, 2026 · 2 points · 0 comments

AI Analysis

●●●● Gem · Zero to One · Big Brain · Wizardry

Proves text safety ≠ tool-call safety; catches hidden harmful executions deterministically.

Strengths
  • GAP benchmark (17,420 datapoints across 6 models and 6 domains) is rigorous research: it found that models refuse harmful requests in text but execute them through tool calls anyway
  • Deterministic YAML contracts (55μs per evaluation) eliminate LLM-in-the-loop failure modes, which suits regulated domains
  • Works across LangChain, CrewAI, OpenAI Agents SDK, and Claude Agent SDK, so integration is genuinely framework-agnostic
Target Audience

AI agent developers, enterprise deploying Claude/OpenAI agents, compliance teams

Post Description

We tested 6 frontier models across 17,420 tool-call interactions and found that models consistently refuse harmful requests in text while executing them through tool calls. We call this divergence the GAP metric. The text says no. The tool call says yes.

Edictum is a runtime governance library that enforces safety contracts at the tool-call boundary: the point where you have the tool name, the arguments, and the ability to block before execution. YAML contracts with preconditions, postconditions, PII redaction. Deterministic allow/deny/redact, no LLM-in-the-loop. Zero runtime dependencies, 55μs per evaluation, works with LangChain, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Agno, Semantic Kernel, and nanobot. MIT licensed.

Paper: https://arxiv.org/abs/2602.16943
GitHub: https://github.com/acartag7/edictum
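
To make the idea of deterministic enforcement at the tool-call boundary concrete, here is a minimal Python sketch. It is not Edictum's API or contract schema: `Contract`, `check_call`, `redact_output`, `governed_call`, and `fake_shell` are hypothetical names invented for illustration, and the rules are plain Python objects rather than YAML so the example stays dependency-free. It shows a precondition that can deny a call before execution and a postcondition that redacts email addresses from the tool's output, with no model involved in either decision.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical contract shape for illustration; not Edictum's real schema.
@dataclass(frozen=True)
class Contract:
    tool: str                                  # tool name the rule applies to
    deny_if: Callable[[dict[str, Any]], bool]  # precondition over the arguments
    reason: str                                # message returned on denial


# Simple email pattern used as a stand-in for PII detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def check_call(contracts, tool, args):
    """Precondition step: decide allow/deny before the tool runs."""
    for contract in contracts:
        if contract.tool == tool and contract.deny_if(args):
            return ("deny", contract.reason)
    return ("allow", "")


def redact_output(text):
    """Postcondition step: mask email addresses in the tool's output."""
    return EMAIL.sub("[REDACTED]", text)


def governed_call(contracts, tool, args, impl):
    """Run impl(**args) only if no contract denies the call."""
    verdict, reason = check_call(contracts, tool, args)
    if verdict == "deny":
        return {"status": "denied", "reason": reason}
    return {"status": "ok", "output": redact_output(impl(**args))}


# Example rule: block shell commands containing a recursive delete.
contracts = [
    Contract(
        tool="shell",
        deny_if=lambda a: "rm -rf" in a.get("command", ""),
        reason="destructive command",
    )
]


def fake_shell(command):
    """Stand-in tool implementation; it does not execute anything."""
    return f"ran {command}; contact admin@example.com"


blocked = governed_call(contracts, "shell", {"command": "rm -rf /"}, fake_shell)
allowed = governed_call(contracts, "shell", {"command": "ls"}, fake_shell)
```

Because the decision depends only on the tool name and arguments, the same call always yields the same verdict, which is the property the post contrasts with LLM-in-the-loop filtering.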
