A governance pattern for self-evolving AI skills
Claude Code Skill pattern paper—interesting theory, but unclear if it ships as a usable tool today.
A design pattern for Claude Code Skills that improve through use — more accurate, more efficient, never bloated. | 越用越准、越用越快、但不臃肿的 Skill 设计模式
Five-gate validation framework prevents skill knowledge drift, but experiments limited to one domain.
Claude Code Skill developers, domain specialists building AI assistants for database, codebase, or business system tasks
Anthropic Prompt Caching · Few-shot examples in system prompts · Knowledge base patterns for LLM agents
Five rounds on a MySQL database (29 tables, 590MB). Key results:
- Five-Gate rejection rate: 63.6% — most interactions produce no knowledge change - Incremental convergence: +75 → +46 → +12 → +21 → +1 - Gate 2 self-correction: caught and fixed 2 erroneous rules the Skill had written earlier - Round 5: zero exploration steps, direct template reuse - Accuracy: 100% (no incorrect knowledge survived)
Unexpected finding: tool usage pitfalls were captured as a high-value byproduct.
A second experiment on a larger telecom billing database is in progress for cross-domain validation.
Claude Code Skill pattern paper—interesting theory, but unclear if it ships as a usable tool today.
Recursive repair loops improve skills automatically, unlike static Claude Code defaults.
Turns pass/fail eval signals into reusable skills without retraining the model.
AI-built language with runnable self-hosting sketches, but human+AI benefits remain unclear.
Lightweight A/B testing for SKILL.md files when LangSmith feels too heavy.
Repurposes BOINC for AI-driven science where agents design and verify physics experiments autonomously.