GitHub Repository

Open-source LLM router & AI cost optimizer. Routes simple prompts to cheap/local models, complex ones to premium — automatically. Drop-in OpenAI-compatible proxy for Claude Code, Codex, Cursor, OpenClaw. Saves 40-70% on AI API costs. Self-hosted, no middleman.

616 starsPython

NadirClaw, LLM router that cuts costs by routing prompts right

Name: NadirClaw, LLM router that cuts costs by routing prompts right
Availability: InStock
Author: amirdor

by amirdor·Feb 17, 2026·1 point·1 comment

Visit Project View on HN

AI Analysis

●●SolidSolve My ProblemNiche Gem

The Take

If you're burning through Claude/OpenAI credits, this is a low-friction stopgap: it classifies prompts in ~10ms and routes trivial tasks to cheaper/local models while reserving premium APIs for complex work. The agentic-task detection, reasoning-aware routing, session pinning and context-window fallback are practical touches that avoid mid-thread model bouncing and 429 failures. It isn't reinventing the space (OpenRouter and others exist), but it's focused on real-world cost tradeoffs and drop-in compatibility.

Post Description

I use Claude and Codex heavily for coding, and I kept burning through my quota halfway through the week. When I looked at my logs, most of my prompts were things like "summarize this," "reformat this JSON," or "write a docstring." Stuff that any small model handles fine.

So I built NadirClaw. It's a Python proxy that sits between your app and your LLM providers. It classifies each prompt in about 10ms and routes simple ones to Gemini Flash, Ollama, or whatever cheap/local model you want. Only the complex prompts hit your premium API.

It's OpenAI-compatible, so you just point your existing tools at it. Works with OpenClaw, Cursor, Claude Code, or anything that talks to the OpenAI API.

In practice I went from burning through my Claude quota in 2 days to having it last the full week. Costs dropped around 60%.

curl -fsSL https://raw.githubusercontent.com/doramirdor/NadirClaw/main/... | sh

Still early. The classifier is simple (token count + pattern matching + optional embeddings), and I'm sure there are edge cases I'm missing. Curious what breaks first, and whether the routing logic makes sense to others.

Repo: https://github.com/doramirdor/NadirClaw

Similar Projects

Developer Tools●●Solid

NadirClaw – Open-source LLM router with 10ms classification

Smart LLM routing cuts costs, but competing against established OpenRouter and vLLM ecosystems.

Solve My ProblemBig Brain

amirdor

105mo ago

Developer Tools●●●Banger

InferShrink – Cut LLM API costs 10x with automatic model routing

Three-line wrapper cuts LLM costs 80%+ via prompt classification and same-provider routing.

Solve My ProblemShip It

doronp

204mo ago

Infrastructure●●Solid

Cascade – A bare-metal C++ proxy that cuts LLM API bills by 70%

ONNX embeddings predict prompt complexity before routing—LiteLLM does this with rules.

Big BrainSolve My Problem

AmixxM

2025d ago

Infrastructure●●Solid

Foreman, a self-hosted LLM gateway for cost aware model routing

Cache-aware LLM routing that doesn't burn prompts to save pennies.

Solve My ProblemNiche Gem

AndrewLiu96

151611d ago

AI/ML●●Solid

Soup – organize and route Agent Skills into your LLM prompts

Self-integrating via Cursor skill is clever, but RAG-based context routing already exists.

Big BrainShip It

kegenaar

2011d ago

Developer Tools●●Solid

API router that picks the cheapest model that fits each query

Komilion turns model sprawl into a cost-control layer you drop in by swapping a base_url: requests are classified (regex fast path + tiny LLM) and matched to ~400 models so cheap models handle the easy stuff and premium models only run when needed. The ~60% zero‑call regex fast path and benchmark-driven routing (LMArena) are clever, pragmatic moves; the hard questions left are model-quality drift across providers and how routing decisions map to real-world user satisfaction.

Solve My ProblemWizardrySlick

robinbanner

115mo ago