Back to browse
CSL MCP Server – Write and Verify AI Safety Policies from Claude/Cursor

CSL MCP Server – Write and Verify AI Safety Policies from Claude/Cursor

by aytuakarlar·Feb 18, 2026·1 point·1 comment

AI Analysis

●●●BangerBig BrainZero to One

Mathematically verified policies enforced outside the model—formal proof replaces prompt engineering.

Strengths
  • Z3 formal verification eliminates logical contradictions at compile time, not runtime
  • MCP integration lets you scaffold + verify + test policies from Claude/Cursor without leaving editor
  • Model-agnostic enforcement means policies work across OpenAI, Anthropic, Llama—true portability
Weaknesses
  • Alpha status (0.3.0a1) with stated breaking changes; production use requires "thorough testing"
  • Unclear if real orgs are adopting this or if it remains a research prototype/proof-of-concept
Target Audience

AI safety teams, enterprise deploying constrained AI agents, compliance-heavy orgs

Similar To

Anthropic's Constitutional AI · Policy-as-code tools like OPA · Guard rails and gate frameworks

Post Description

CSL-Core is a policy engine that formally verifies AI agent constraints using Z3. Instead of prompting an LLM to behave, you write a small policy in CSL (Constitutional Specification Language), Z3 proves it has no contradictions at compile time, and a deterministic runtime enforces it — completely outside the model. We just shipped a built-in MCP server with 4 tools:

verify_policy: Z3 formal verification in one call

simulate_policy: test any JSON input, get ALLOWED/BLOCKED

explain_policy: human-readable breakdown of any policy

scaffold_policy: describe what you want in English, get a CSL template

This means you can do this from Claude Desktop or Cursor:

"Write me a policy that blocks transfers over $5000 for non-admin users"

→ scaffold generates a CSL template

→ verify proves it has no contradictions

→ simulate tests your edge cases

→ all without leaving your editor

The full loop: From English description to mathematically verified, runtime-enforced policy; happens inside your AI assistant.

Why not just prompt the LLM to enforce rules? We benchmarked GPT-4o, Claude Sonnet 4, and Gemini 2.0 Flash as guardrails with a hardened system prompt. Every model was bypassed by at least one attack (context spoofing, multi-turn role escalation, unicode homoglyphs). CSL-Core blocked all of them; because the LLM never touches the enforcement layer.

Setup:

pip install "csl-core[mcp]"

Claude Desktop config:

{

"mcpServers": {

"csl-core": {

"command": "csl-core-mcp"

}

}

}

Or with Docker:

docker build -t csl-core-mcp .

docker run -i csl-core-mcp

GitHub: https://github.com/Chimera-Protocol/csl-core

Listed on awesome-mcp-servers:

https://github.com/punkpeye/awesome-mcp-servers

Previous Show HN discussion: https://news.ycombinator.com/item?id=46963250

Would love feedback, especially from anyone building agentic systems where safety guarantees matter.

Similar Projects

AI/ML●●Solid

MCP server for persistent context and safety in Claude Code

Harness-level command interception blocks dangerous operations, not just prompt-based safety.

Solve My ProblemDark Horse
pinip833
211mo ago
AI/MLMid

Agent Brain Trust, customisable expert panels for AI agents

Expert panel AI orchestration, but 2 GitHub stars says it all.

Ship It
bahulneel
501mo ago