Agent-skills-eval – Test whether Agent Skills improve outputs
Lightweight A/B testing for SKILL.md files when LangSmith feels too heavy.
Unofficial Substack API reference: 129 verified endpoints with body shapes, gathered across 14 capture rounds. Both the read and write sides are fully mapped. Includes a Claude Agent SDK skill manifest and a TypeScript client.
Comprehensive Substack API docs, but it's still just documentation for an unofficial API.
Developers building Substack integrations or AI agents
substack_api (NHagar) · python-substack (ma2za)
Discovered, mapped and tested 129 endpoints so you can fully control Substack via an agent.
First benchmark measuring semantic correctness over text similarity for document parsing.
Enforces test independence in AI agents to break the confirmation bias loop.
Formal verification for agent skills when heuristic scanners always fail.
Hash-verified doc citations enforce truth and genuinely solve AI agent hallucination on stale docs.
Code-reformatting skill to read AI output faster, but narrow scope and unproven impact.