GitHub Repository

A design pattern for Claude Code Skills that improve through use: more accurate, more efficient, never bloated.

24 stars · Python

Self-Evolving Skill – empirical results from a 5-round experiment

Author: tiansenxu

by tiansenxu · Mar 8, 2026 · 1 point · 0 comments

AI Analysis

● Mid · Big Brain

The five-gate validation framework prevents Skill knowledge drift, but the experiments so far cover only one domain.

Strengths

• Concrete empirical validation with metrics: 63.6% rejection rate, 100% accuracy, Gate 2 self-correction.
• Five-gate framework (value, alignment, redundancy, freshness, placement) is generalizable beyond databases.
• Addresses a real problem: static skills waste context window rediscovering the same domain patterns.
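
The five gates named above can be read as an ordered filter over candidate knowledge. The following is only a minimal sketch under stated assumptions: the `Candidate` fields, the predicate behind each gate, and the helper names are all hypothetical, since the listing gives the gate names but not how the project implements them.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    """A piece of knowledge the Skill might record (hypothetical shape)."""
    text: str
    reusable: bool                 # would it help a future task?
    in_scope: bool                 # consistent with the Skill's purpose?
    still_valid: bool              # confirmed against the current system state?
    target_section: Optional[str]  # where it would be stored, if anywhere


Gate = Callable[[Candidate, set[str]], bool]

# Gate order follows the listing; each predicate is an illustrative stand-in.
GATES: list[tuple[str, Gate]] = [
    ("value", lambda c, known: c.reusable),
    ("alignment", lambda c, known: c.in_scope),
    ("redundancy", lambda c, known: c.text not in known),
    ("freshness", lambda c, known: c.still_valid),
    ("placement", lambda c, known: c.target_section is not None),
]


def first_failed_gate(candidate: Candidate, known: set[str]) -> Optional[str]:
    """Return the name of the first gate the candidate fails, or None."""
    for name, passes in GATES:
        if not passes(candidate, known):
            return name
    return None


def apply_gates(candidates: list[Candidate], known: set[str]):
    """Split candidates into accepted items and a {text: failed_gate} map."""
    accepted: list[Candidate] = []
    rejected: dict[str, str] = {}
    for c in candidates:
        failed = first_failed_gate(c, known)
        if failed is None:
            accepted.append(c)
            known.add(c.text)  # accepted knowledge becomes part of the Skill
        else:
            rejected[c.text] = failed
    return accepted, rejected
```

In this sketch a rejection rate is simply `len(rejected) / len(candidates)` for a batch; recording which gate failed makes it possible to see why most interactions leave the Skill unchanged.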

Weaknesses

• Limited scope: validated only on a single 590 MB MySQL database; the second database experiment is still in progress, so there is no cross-domain proof yet.
• Niche audience (Claude Code Skill authors) and an immature pattern: published as a design paper, not a finished tool.

Post Description

Last week I shared the design pattern. This week I ran a real experiment to validate it.

Five rounds on a MySQL database (29 tables, 590 MB). Key results:

- Five-Gate rejection rate: 63.6% (most interactions produce no knowledge change)
- Incremental convergence: +75 → +46 → +12 → +21 → +1
- Gate 2 self-correction: caught and fixed 2 erroneous rules the Skill had written earlier
- Round 5: zero exploration steps, direct template reuse
- Accuracy: 100% (no incorrect knowledge survived)
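
The convergence series can be checked with a few lines of arithmetic. This sketch uses only the five per-round increments reported above. The post does not state the unit of those increments, so the variable names are neutral and the interpretation is left open.

```python
from itertools import accumulate

# Per-round increments as reported in the post (unit not stated there).
increments = [75, 46, 12, 21, 1]

# Running total of knowledge added after each round.
cumulative = list(accumulate(increments))
total = cumulative[-1]

# Each round's share of everything added across the five rounds, in percent.
shares = [round(100 * inc / total, 1) for inc in increments]

for round_no, (inc, cum, share) in enumerate(
    zip(increments, cumulative, shares), start=1
):
    print(f"round {round_no}: +{inc:>2}  cumulative {cum:>3}  share {share:>4}%")
```

The first two rounds account for 121 of the 155 total (about 78%), and the final round for under 1%. The series is not strictly decreasing: round 4 (+21) adds more than round 3 (+12).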

Unexpected finding: tool usage pitfalls were captured as a high-value byproduct.

A second experiment on a larger telecom billing database is in progress for cross-domain validation.