GitHub Repository

Let Codegen Agents finally understand API docs.

8 stars · TypeScript

API Ingest – Agentic Search (Inter) API Docs

by mohidbutt·Apr 22, 2026·3 points·2 comments

AI Analysis

●● Solid · Big Brain · Ship It

Deterministic endpoint lookup by operationId beats Context7's semantic search for API precision.

Strengths
  • OperationId-based lookup eliminates semantic search fuzziness for API calls
  • Token-efficient: lazily loads chunks instead of putting the full spec in context
  • Works across Claude Code, Cursor, and any MCP-compatible client
Weaknesses
  • Single-digit star count suggests limited real-world validation so far
  • The API-docs-for-agents space is getting crowded, with Context7 and others
Category
Target Audience

Developers building AI agents that interact with external APIs

Similar To

Context7 · GitBook MCP

Post Description

1. Claude Code (CC) / Codex don't handle API docs well enough

No matter what I do, I run into bad requests with Claude, day in, day out.

It makes up arguments, misunderstands required types, and misses fields in the requests. And when it catches its own mistakes, the web search it then initiates usually ends in fuzzy scraped information that causes even more issues.

Context7 helps. It's better than starting only with the LLM's vague (mis)understanding from pretraining. But it only does semantic search, and semantic search is often not precise enough for the hyper-precision that API requests need: CC runs into the same misunderstandings as above, and burns tons of tokens in the process.

2. What about Deterministic Search in OpenAPI Specs?

In my opinion, agents need 1) a holistic understanding of the damn thing, and 2) the ability to do some type of agentic search within the docs.

Thankfully, we do have magnificently standardized formats for API schemas, most notably OpenAPI/Swagger. Why is no one (to the best of my knowledge) making use of them?

As I need to work a lot with APIs, I started to build something myself a few months ago. In the end it's a simple Python script that splits the JSON/YAML/RAML/etc. files into a) a holistic overview ("manifest"), and b) indexed chunks (by endpoints, tags, and schemas) as md files. Agents can access them via MCP. It works with either a) local files that you convert yourself, or b) community-converted files, and gives the agent the capability to do agentic search on the specs. You can check it out here, and hook up the MCP server: https://github.com/mohidbt/api-ingest
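To make the manifest-plus-chunks idea concrete, here is a minimal sketch of how such a split could look. It is not the actual api-ingest code: the function names (`split_spec`, `render_endpoint`), the chunk key format (`endpoint/<operationId>`, `schema/<name>`), and the tiny "Demo API" spec are all assumptions made up for illustration. It only handles an already-parsed OpenAPI 3.x document (a Python dict).

```python
import json

# HTTP methods that may appear as keys of an OpenAPI Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def render_endpoint(method: str, path: str, op: dict) -> str:
    """Render one operation as a small Markdown chunk (hypothetical layout)."""
    lines = [f"# {method.upper()} {path}", "", op.get("summary", ""), "", "## Parameters"]
    for p in op.get("parameters", []):
        required = "required" if p.get("required") else "optional"
        p_type = p.get("schema", {}).get("type", "any")
        lines.append(f"- `{p['name']}` ({p['in']}, {p_type}, {required})")
    return "\n".join(lines)


def split_spec(spec: dict) -> tuple[dict, dict]:
    """Split a parsed OpenAPI document into (manifest, chunks)."""
    info = spec.get("info", {})
    manifest = {"title": info.get("title", ""), "version": info.get("version", ""), "operations": []}
    chunks: dict[str, str] = {}

    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters" or "summary"
            # Assumption: fall back to a synthetic id when operationId is missing.
            op_id = op.get("operationId") or f"{method}_{path}"
            manifest["operations"].append({
                "operationId": op_id,
                "method": method.upper(),
                "path": path,
                "summary": op.get("summary", ""),
                "tags": op.get("tags", []),
            })
            chunks[f"endpoint/{op_id}"] = render_endpoint(method, path, op)

    for name, schema in spec.get("components", {}).get("schemas", {}).items():
        chunks[f"schema/{name}"] = f"# Schema {name}\n\n" + json.dumps(schema, indent=2)

    return manifest, chunks


# Made-up spec, used only to exercise the sketch.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Demo API", "version": "1.0"},
    "paths": {
        "/papers/{paper_id}": {
            "get": {
                "operationId": "getPaper",
                "summary": "Fetch one paper",
                "tags": ["papers"],
                "parameters": [
                    {"name": "paper_id", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            }
        }
    },
    "components": {"schemas": {"Paper": {"type": "object",
                                         "properties": {"title": {"type": "string"}}}}},
}

manifest, chunks = split_spec(SPEC)
print(json.dumps(manifest, indent=2))   # the small overview an agent reads first
print(chunks["endpoint/getPaper"])      # exact lookup by key, no similarity ranking
```

The point of the design is that the agent reads the small manifest first and then fetches a single chunk by exact key, so the lookup step involves no fuzzy matching.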

3. Should we benchmark this? // Feel free to contribute!

WDYT? I am thinking about quantitatively corroborating my assumptions by doing some type of evals. And yes, this by-endpoint indexing approach also has many limitations, e.g. when the individual chunks are themselves way too big to load fully into context.

Genuinely curious about all your thoughts.

PS: Yes, for many companies, especially AI-tech ones, we already have agent-optimized API doc formats, like llms.txt in the docs, or skills built for using the APIs; and that's wonderful! But what about, e.g., the Semantic Scholar Graph APIs? What do you do if core CC & Context7 fail? Check out this example: https://github.com/mohidbt/api-ingest/tree/main#opus-47-exam...
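The oversized-chunk limitation could at least be detected up front. The sketch below is an assumption, not part of api-ingest: it estimates tokens with a crude 4-characters-per-token heuristic (a real tokenizer will give different numbers) and lists the chunks that exceed a budget. `estimate_tokens`, `oversized_chunks`, and the sample chunk keys are hypothetical names.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate: ceil(len(text) / chars_per_token). Heuristic only."""
    return -(-len(text) // chars_per_token)


def oversized_chunks(chunks: dict[str, str], budget: int) -> list[str]:
    """Return the keys of chunks whose estimated token count exceeds the budget."""
    return sorted(key for key, text in chunks.items() if estimate_tokens(text) > budget)


# Made-up chunks, used only to exercise the sketch.
sample = {
    "endpoint/getPaper": "x" * 40,       # about 10 estimated tokens
    "schema/HugeResponse": "x" * 400,    # about 100 estimated tokens
}
too_big = oversized_chunks(sample, budget=50)
print(too_big)
```

Chunks flagged this way are candidates for a further split (for example, per response schema) before an agent is allowed to load them.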

Similar Projects

Developer Tools · ●● Solid

Turn any OpenAPI spec into agent-callable skills

It extracts focused, executable operations from giant OpenAPI files (the GitHub REST YAML is shown) to shrink context and avoid sidecar adapter sprawl, a pragmatic answer to token bloat and brittle ad-hoc integrations. Useful and concrete: if it actually generates tidy, updateable skill units and runtime hooks, it saves a lot of maintenance. That said, the idea competes with existing LangChain and OpenAI function-calling patterns; the repo will need clear runtime, versioning, and update strategies to feel like more than a nicer converter.

Solve My Problem · Niche Gem
yz-yu
103mo ago