GitHub Repository

Route prompts to the cheapest model that handles them. Claude + GPT-4o + Groq. Live cost tracking. Built with pydantic-ai + litellm.

0 stars · Python

Route LLM prompts to cheapest capable model – pydantic-AI and litellm

by reactance0083 · Jun 23, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

A pydantic-ai structured routing step picks the cheapest capable model before litellm executes the call.

Strengths

• Structured RoutingDecision model with reason field makes routing decisions auditable.
• Live /stats endpoint tracks cost per model with token counts and call history.
• Quality tiers (fast/standard/quality/max) map cleanly to use cases and budgets.
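The strengths above can be illustrated with a minimal sketch. It is not the project's code: it uses standard-library dataclasses in place of pydantic-ai, a toy word-count heuristic in place of the LLM router, and the model names, prices, and the `route` and `CostTracker` helpers are hypothetical placeholders. Only the `RoutingDecision` name, its reason field, and the four tier names come from the listing.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Tier(str, Enum):
    FAST = "fast"
    STANDARD = "standard"
    QUALITY = "quality"
    MAX = "max"


# Hypothetical placeholders: model names and USD prices per 1K tokens.
TIER_MODELS: Dict[Tier, str] = {
    Tier.FAST: "example-fast",
    Tier.STANDARD: "example-standard",
    Tier.QUALITY: "example-quality",
    Tier.MAX: "example-max",
}
PRICE_PER_1K: Dict[str, float] = {
    "example-fast": 0.25,
    "example-standard": 1.0,
    "example-quality": 4.0,
    "example-max": 16.0,
}


@dataclass(frozen=True)
class RoutingDecision:
    tier: Tier
    model: str
    reason: str  # human-readable justification, kept for auditing


def route(prompt: str) -> RoutingDecision:
    """Toy word-count heuristic standing in for the LLM-based router.

    It never selects Tier.MAX; a real router would reserve that tier
    for the hardest prompts.
    """
    words = len(prompt.split())
    if words < 20:
        tier, reason = Tier.FAST, "short prompt"
    elif words < 200:
        tier, reason = Tier.STANDARD, "medium-length prompt"
    else:
        tier, reason = Tier.QUALITY, "long prompt"
    return RoutingDecision(tier, TIER_MODELS[tier], reason)


@dataclass
class CostTracker:
    """In-memory call log, roughly what a /stats endpoint could serve."""
    history: List[dict] = field(default_factory=list)

    def record(self, decision: RoutingDecision, tokens: int) -> float:
        cost = tokens / 1000 * PRICE_PER_1K[decision.model]
        self.history.append({"model": decision.model, "tokens": tokens,
                             "cost_usd": cost, "reason": decision.reason})
        return cost

    def stats(self) -> Dict[str, dict]:
        # Aggregate call count, tokens, and cost per model.
        totals: Dict[str, dict] = {}
        for call in self.history:
            entry = totals.setdefault(
                call["model"], {"calls": 0, "tokens": 0, "cost_usd": 0.0})
            entry["calls"] += 1
            entry["tokens"] += call["tokens"]
            entry["cost_usd"] += call["cost_usd"]
        return totals
```

Keeping the reason next to each recorded call is what makes a routing choice reviewable after the fact, alongside the per-model totals.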

Weaknesses

• LLM routing is well-served by LiteLLM, LangChain, and provider-native solutions already.
• Using one LLM to route to another adds latency and its own cost overhead.

Similar Projects

Developer Tools · ●●● Banger

InferShrink – Cut LLM API costs 10x with automatic model routing

Three-line wrapper cuts LLM costs 80%+ via prompt classification and same-provider routing.

Solve My Problem · Ship It

doronp

203mo ago

AI/ML · ●● Solid

Kronaxis Router – Don't pay frontier prices when a local LLM is enough

LLM cost routing with LoRA awareness when LiteLLM already handles basic proxying.

Big Brain · Solve My Problem

JasonDuke

202mo ago

Developer Tools · ● Mid

acorn – LLM framework for long running agents

Yet another LLM orchestration layer over LiteLLM + Pydantic when DSPy and LangChain dominate.

Ship It · Slick

onel

503mo ago

Developer Tools · ●● Solid

Prismag – Per-block model routing for the terminal and any IDE

@@ syntax avoids IDE collisions and chains block outputs in one prompt.

Big Brain · Ship It

arthur-G

501d ago

Developer Tools · ●● Solid

API router that picks the cheapest model that fits each query

Komilion turns model sprawl into a cost-control layer you drop in by swapping a base_url: requests are classified (regex fast path + tiny LLM) and matched to ~400 models so cheap models handle the easy stuff and premium models only run when needed. The ~60% zero‑call regex fast path and benchmark-driven routing (LMArena) are clever, pragmatic moves; the hard questions left are model-quality drift across providers and how routing decisions map to real-world user satisfaction.
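A rough sketch of a regex fast path with a classifier fallback, as described above. The patterns, tier labels, and the `fast_path` and `classify` helpers are hypothetical; Komilion's actual rules are not given in the listing, and the fallback here is just any callable standing in for the tiny LLM.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical patterns; the first match wins.
FAST_PATTERNS = [
    (re.compile(r"^\s*(hi|hello|thanks|thank you)\b", re.IGNORECASE), "cheap"),
    (re.compile(r"\b(prove|refactor|architecture)\b", re.IGNORECASE), "premium"),
]


def fast_path(prompt: str) -> Optional[str]:
    """Return a tier if any regex matches, else None (no model call made)."""
    for pattern, tier in FAST_PATTERNS:
        if pattern.search(prompt):
            return tier
    return None


def classify(prompt: str, fallback: Callable[[str], str]) -> Tuple[str, bool]:
    """Return (tier, used_fallback); the fallback runs only on a regex miss."""
    tier = fast_path(prompt)
    if tier is not None:
        return tier, False
    return fallback(prompt), True
```

The second tuple element records whether the fallback classifier ran, which is the figure a fast-path hit rate would be computed from.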

Solve My Problem · Wizardry · Slick

robinbanner

114mo ago

Developer Tools · ●● Solid

NadirClaw, LLM router that cuts costs by routing prompts right

If you're burning through Claude/OpenAI credits, this is a low-friction stopgap: it classifies prompts in ~10ms and routes trivial tasks to cheaper/local models while reserving premium APIs for complex work. The agentic-task detection, reasoning-aware routing, session pinning and context-window fallback are practical touches that avoid mid-thread model bouncing and 429 failures. It isn't reinventing the space (OpenRouter and others exist), but it's focused on real-world cost tradeoffs and drop-in compatibility.
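Session pinning and context-window fallback can be sketched as follows. This is an illustration under assumptions, not NadirClaw's implementation: the model names, context sizes, and the `SessionRouter` class are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical models, ordered cheapest first: (name, context window in tokens).
MODELS: List[Tuple[str, int]] = [
    ("local-small", 8_000),
    ("mid-tier", 32_000),
    ("premium", 200_000),
]


class SessionRouter:
    def __init__(self) -> None:
        self._pinned: Dict[str, str] = {}  # session id -> model name

    def pick(self, session_id: str, wanted: str, prompt_tokens: int) -> str:
        # Session pinning: reuse the session's model and ignore the new
        # suggestion, so a conversation does not bounce between models.
        model = self._pinned.get(session_id, wanted)
        windows = dict(MODELS)
        # Context-window fallback: if the prompt no longer fits, move to
        # the cheapest model whose window is large enough.
        if prompt_tokens > windows[model]:
            for name, window in MODELS:
                if prompt_tokens <= window:
                    model = name
                    break
            else:
                raise ValueError("prompt exceeds every model's context window")
        self._pinned[session_id] = model
        return model
```

In this sketch a session that outgrows its model is re-pinned to the larger one, so later short turns stay there instead of dropping back.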

Solve My Problem · Niche Gem

amirdor

114mo ago