AI agent got 237 rules from another agent, still made the same mistakes
Auto-promotes recurring corrections to rules when agents keep making the same mistakes.

Rules don't transfer across contexts—executable mechanisms do, per field study.
AI agent developers and researchers
LangChain memory systems · Agent correction tracking tools
When I transferred 237 of those corrections as rules to a new agent to save time with onboarding in a new repo, it made 44 new mistakes. 13 were in categories the rules explicitly covered. The rules were present in context. The behavior wasn't there. I published the field study with full correction logs.
Then Meta's Superintelligence Labs published HyperAgents (arXiv:2603.19461, March 2026). They found the complementary result: improvements DO transfer across domains when embodied in executable mechanisms (persistent memory, performance tracking, eval loops), not when written as rule text. Two independent studies, same boundary: documentation is not behavior.
So I built Calx. pip install getcalx gives you a CLI + MCP server that:
Captures corrections developers make to AI agents Detects recurrence via keyword similarity (Jaccard), auto-promotes at 3x threshold Promotes recurring corrections to enforced rules and hooks, injected at session start Scopes rules per domain/directory so each agent gets only what's relevant
It runs as a FastMCP server over Streamable HTTP (SQLite locally) so any MCP-compatible client connects: Claude Code, Claude Desktop, Cursor, custom agents. It is primarily designed for Claude Code. It also handles token discipline (prevents context compaction from destroying correction signal), multi-agent orchestration, session lifecycle hooks, orientation gates, and dirty-exit recovery.
The difference from agent memory tools: existing agent memory systems store information for retrieval. Calx tracks the behavioral plane, how an agent works with a specific person, not just what it knows. The data shows the information plane alone doesn't reliably change behavior.
v0.5.0, 443 tests, MIT license. Paper with full evidence: https://doi.org/10.5281/zenodo.19159223
Auto-promotes recurring corrections to rules when agents keep making the same mistakes.
Canonical APIs and JSON error patches stop agents from guessing syntax.
Logs show 73% of Claude Code tokens are re-reading files you already paid to read.
Claude agent that learns from mistakes via structured signal tracing—interesting approach, unclear real-world ROI.
Formal verification for AI agents before compilation, unlike LangChain or AutoGen.
SQL-for-agents concept, but orchestration frameworks already exist; execution clarity unclear.