GitHub Repository

A CLI tool that measures how well AI coding agents (Claude Code, Codex, Gemini CLI, etc.) can use your SDK.

15 stars·TypeScript

I built a way to see if your SDK is AI-friendly

by nguyenhu·Apr 28, 2026·4 points·0 comments

AI Analysis

●●● Banger·Big Brain·Zero to One

Sandboxed microVMs test how AI agents use your SDK with separate public/private access.

Strengths
  • Test-taker and judge agents have different access levels, preventing IP leakage while enabling thorough evaluation
  • MicroVM sandboxing isolates agent execution with monitored egress and blocked secret exposure
  • Addresses an emerging problem that didn't exist before AI coding agents became mainstream
Weaknesses
  • Requires Linux with KVM or macOS Apple Silicon, limiting adoption on Windows machines
  • Depends on external AI agent CLIs being installed and configured with API keys
Target Audience

SDK maintainers and library authors

Post Description

Have you ever wondered whether your SDK is friendly to agentic AI tools like Claude Code or Codex? I built an open-source (Apache 2.0) CLI that answers that question for you.

With it, you can create a test suite either manually or with an agent, based on your source code and documentation. The CLI dispatches agents, each in its own sandboxed microVM, to solve each test. The results are then graded by a separate judge agent.
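
A rough sketch of that dispatch-then-grade flow, for illustration only. This is not the tool's actual code: every type and function name below is hypothetical, the agents are stubs, and running the tests concurrently is an assumption.

```typescript
// Hypothetical shapes; the real tool's data model may differ.
interface TestCase { id: string; task: string; }
interface Attempt { testId: string; solution: string; }
interface Grade { testId: string; passed: boolean; notes: string; }

// A test-taker solves one test; a judge grades one attempt.
type TestTaker = (test: TestCase) => Promise<Attempt>;
type Judge = (test: TestCase, attempt: Attempt) => Promise<Grade>;

// Solve every test, then hand each attempt to the judge.
// Promise.all keeps the grades in the same order as the tests.
async function runSuite(
  tests: TestCase[],
  solve: TestTaker,
  grade: Judge,
): Promise<Grade[]> {
  return Promise.all(
    tests.map(async (test) => {
      const attempt = await solve(test);
      return grade(test, attempt);
    }),
  );
}

// Stub agents standing in for real sandboxed agent runs.
const stubTaker: TestTaker = async (test) => ({
  testId: test.id,
  solution: `solved: ${test.task}`,
});

const stubJudge: Judge = async (test, attempt) => ({
  testId: test.id,
  passed: attempt.solution.length > 0,
  notes: "stub grade",
});
```

Passing the test-taker and judge in as functions keeps the orchestration separate from how each agent is actually launched.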

Test-taker agents only have access to public information (guides, blogs, package metadata), while judge agents have access to both public and private information (source code, internal documents).
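
The access split can be pictured as a simple filter over labeled resources. This is a minimal sketch, assuming each resource is tagged public or private; the names `Resource`, `Role`, and `visibleTo` are made up for illustration and are not the tool's API.

```typescript
// Hypothetical model of the two access levels described above.
type Visibility = "public" | "private";
type Role = "test-taker" | "judge";

interface Resource {
  name: string;
  visibility: Visibility;
}

// Test-takers see only public material; judges see everything.
function visibleTo(role: Role, resources: Resource[]): Resource[] {
  return role === "judge"
    ? resources
    : resources.filter((r) => r.visibility === "public");
}

// Example resources taken from the categories named in the post.
const resources: Resource[] = [
  { name: "guides", visibility: "public" },
  { name: "package metadata", visibility: "public" },
  { name: "source code", visibility: "private" },
  { name: "internal documents", visibility: "private" },
];

const testTakerView = visibleTo("test-taker", resources).map((r) => r.name);
const judgeView = visibleTo("judge", resources).map((r) => r.name);
```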

After the test results are generated, you can improve your SDK manually or use an agent to automate the process.

Agents are sandboxed, which means:
  • Host machine secrets (API keys) are not exposed to the sandbox environment
  • Egress HTTP requests are monitored, and judge agents' egress is limited to trusted domains so that proprietary IP is not exfiltrated
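
To make those two guarantees concrete, here is a small sketch of the policy logic only. The post does not say how the tool implements this, and real enforcement would sit at the microVM and network layer. The secret-name pattern, `scrubEnv`, and `isEgressAllowed` are all assumptions made up for this example.

```typescript
// Hypothetical rule: env var names that look like credentials are secrets.
const SECRET_NAME = /(KEY|TOKEN|SECRET|PASSWORD)/i;

// Copy the host environment, dropping anything that looks like a secret.
function scrubEnv(env: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    if (!SECRET_NAME.test(name)) {
      safe[name] = env[name];
    }
  }
  return safe;
}

// Allow a request only if its host is a trusted domain or a subdomain of one.
function isEgressAllowed(url: string, trustedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are denied
  }
  return trustedDomains.some(
    (domain) => host === domain || host.endsWith("." + domain),
  );
}

// Example policy: the judge may only reach one trusted (placeholder) domain.
const judgeTrusted = ["example.com"];
const sandboxEnv = scrubEnv({ PATH: "/usr/bin", MY_API_KEY: "not-a-real-key" });
const allowed = isEgressAllowed("https://docs.example.com/guide", judgeTrusted);
const blocked = isEgressAllowed("https://evilexample.com/upload", judgeTrusted);
```

The match requires a leading dot before the trusted domain, so a lookalike host such as `evilexample.com` does not pass as `example.com`.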

Features:
  • CLI commands for the entire workflow: generating, evaluating, and reporting on test suites
  • Agent skills for each command
  • Local web UI for inspecting test results and editing test cases visually

GitHub: https://github.com/PSPDFKit-labs/agentic-usability
