I ran every Claude agent turn through the Batch API
50% cheaper tokens but 2-minute waits kill interactive agent UX.
Turns existing LangChain pipelines into first-class batch jobs you can submit to provider batch endpoints without rewriting your chains. It automates JSONL uploads, polling (BatchPoller), provider-specific parsing into unified BatchItem results, partial-failure handling, and on-disk job persistence so batches can outlive your process — Vertex/Azure support is on the roadmap.
Backend/ML engineers and LangChain developers running high-volume or cost-sensitive LLM workloads
The problem is the batch API interface is completely different from the real-time one. OpenAI requires JSONL file uploads and polling. Anthropic has its own Message Batches format. If you have an existing LangChain pipeline, you'd have to rewrite it.
langasync wraps both batch APIs behind LangChain's Runnable interface:
batch = batch_chain(prompt | model | parser) job = await batch.submit(inputs) results = await job.get_results()
It handles file formatting, submission, polling, result parsing, partial failure handling, and job persistence (batch jobs can outlive your process, so metadata is stored to disk and you can resume later).
Python, Apache 2.0, on PyPI. Vertex AI and Azure OpenAI on the roadmap.
50% cheaper tokens but 2-minute waits kill interactive agent UX.
Hub-and-spoke IR translates LLM APIs without N^2 adapter hell.
Multi-provider cost dashboard when LangFuse and Helicone already do this.
One-line instrumentation (client = track(OpenAI())) that logs every call to ~/.token-guard/guard.db is the simple, practical win here — you get per-model/day reports, CSV export and webhook alerts without shipping telemetry. It's a tidy, privacy-minded tool that nails the common pain of runaway bills, though it doesn't reinvent the space (a small GUI or automatic pricing sync would push it into 'banger' territory).
OpenAI SDK calls Claude through one proxy with conformance-tested wire translation.
Route Claude Code work to Batch API; saves money if you don't need instant replies.