AgentCost – Track, control, and optimize your AI spending (MIT)
One-line wrapping eliminates invisible LLM spend; real cost forecasting and model recommendations.
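To make "one-line wrapping" concrete, here is a minimal sketch of the general pattern, not AgentCost's actual API. The names `CostLedger`, `track_cost`, and `fake_llm_call` are hypothetical, and the prices are placeholders rather than real vendor rates.

```python
import functools
from dataclasses import dataclass, field

# Placeholder per-1K-token prices in USD (assumed for illustration only).
PRICES_PER_1K = {"small-model": {"input": 0.0005, "output": 0.0015}}


@dataclass
class CostLedger:
    """Accumulates spend across wrapped calls."""
    total_usd: float = 0.0
    calls: list = field(default_factory=list)

    def record(self, model, input_tokens, output_tokens):
        price = PRICES_PER_1K[model]
        cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1000
        self.total_usd += cost
        self.calls.append((model, input_tokens, output_tokens, cost))
        return cost


def track_cost(ledger):
    """Decorator for functions returning a dict with 'model' and 'usage' keys."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            response = fn(*args, **kwargs)
            usage = response["usage"]
            ledger.record(response["model"], usage["input_tokens"], usage["output_tokens"])
            return response
        return wrapper
    return decorator


ledger = CostLedger()


@track_cost(ledger)  # the single line added to an existing call site
def fake_llm_call(prompt):
    # Stand-in for a real client call; tokens are approximated by word count.
    return {
        "model": "small-model",
        "text": "ok",
        "usage": {"input_tokens": len(prompt.split()), "output_tokens": 1},
    }
```

After any number of wrapped calls, `ledger.total_usd` holds the running spend and `ledger.calls` holds the per-call breakdown that a forecast could be built on.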

Deterministic verification loop makes 3.8B models match 7x larger ones for structured extraction.
Developers using local LLMs for structured data extraction
Instructor · Pydantic · guidance
RFC 3339 hits 88% accuracy while Unix epoch fails 50% of the time.
Deterministic prompt compression cuts tokens 50-80% without extra model calls.
Replay-first architecture beats LangSmith's static traces for debugging non-deterministic agents.
Compressed JSON bundles fit tight context windows better than pasting files.
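The "deterministic verification loop" idea from the list above can be sketched roughly as follows. This is an assumed shape, not the listed project's implementation: the model's output is checked by plain code, and any failures are fed back into the next attempt. The `call_model` callable, the `name`/`age` schema, and the feedback format are all hypothetical.

```python
import json


def verify(record):
    """Deterministic checks; returns a list of error strings (empty means valid)."""
    errors = []
    name = record.get("name")
    age = record.get("age")
    if not isinstance(name, str) or not name:
        errors.append("name must be a non-empty string")
    if not isinstance(age, int) or isinstance(age, bool):
        errors.append("age must be an integer")
    return errors


def extract_with_verification(call_model, text, max_attempts=3):
    """Call the model until its JSON output passes verify(), or give up."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        raw = call_model(text, feedback)
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            record, errors = None, [f"invalid JSON: {exc.msg}"]
        else:
            if isinstance(record, dict):
                errors = verify(record)
            else:
                errors = ["output must be a JSON object"]
        if not errors:
            return record, attempt
        # Feed the concrete failures back so the next attempt can correct them.
        feedback = "; ".join(errors)
    raise ValueError(f"no valid output after {max_attempts} attempts: {feedback}")
```

Because the checks are ordinary code, the same input always produces the same verdict; only the model's responses vary between attempts.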
Treats model calls as first-class runtime constructs using $..$ blocks with declared-type enforcement, which makes it unusually ergonomic to intermix deterministic code and LLM-powered operations. The Polars-backed dataframe injection, which emits structured JSON summaries instead of raw table dumps, is a clever, practical touch for token efficiency. A neat sandbox for language designers, but explicitly a hobby/toy project rather than something to deploy.
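As a rough illustration of why a structured summary is cheaper than a raw table dump, the sketch below condenses rows into a small JSON object. It deliberately uses only the Python standard library instead of Polars, and the `summarize_table` function and its output fields are assumptions for illustration, not the project's actual format.

```python
import json
from statistics import mean


def summarize_table(rows, max_examples=2):
    """Return a compact JSON summary of a list of row dicts."""
    columns = {}
    for name in (rows[0].keys() if rows else []):
        values = [r[name] for r in rows if r.get(name) is not None]
        info = {"non_null": len(values)}
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            # Numeric columns collapse to a few statistics.
            info.update(type="number", min=min(values), max=max(values),
                        mean=round(mean(values), 3))
        else:
            # Other columns keep only a handful of distinct example values.
            info.update(type="other",
                        examples=sorted({str(v) for v in values})[:max_examples])
        columns[name] = info
    return json.dumps({"row_count": len(rows), "columns": columns}, sort_keys=True)
```

The summary size depends on the number of columns rather than the number of rows, which is what keeps the prompt small as the table grows.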