GitHub Repository

A fast, native-protocol LLM gateway with weighted pool composition and correct billing-vs-client failure handling.

0 stars · Rust

Busbar – every LLM behind one URL, in a single Rust binary

by mattjackson86 · Jun 5, 2026 · 1 point · 0 comments

AI Analysis

●●● Banger · Solve My Problem · Slick

Mid-request failover reroutes streaming responses before your client sees a byte.

Strengths

• Lossless protocol translation preserves cache_control, thinking blocks, and citations.
• 7.4 MB static binary with sub-15 ms cold start beats Python sidecar deployments.
• Weighted pool composition turns model selection into config, not code changes.

Weaknesses

• LLM gateway category already has LiteLLM, Portkey, and Helicone with funding.
• AGPL license may limit enterprise adoption compared to MIT or Apache alternatives.

Post Description

I have been working on multiple projects lately involving AI endpoints (including some I run locally), and I found I needed a way to easily load-balance across several of them. Sometimes my on-prem setup could not handle the load and I'd have to crank up the z.ai or Anthropic usage, depending on where my credits were at the time.

One thing led to another and I ended up writing Busbar: an LLM gateway written in Rust (I have a thing for Rust lately). You point your existing OpenAI/Anthropic/Gemini SDK at it, change the model to a pool name, and that name now load-balances across the vendors. Your client code doesn't change and never learns it even happened.
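To make the pool idea concrete, here is a minimal sketch of weighted selection in Rust. It is an illustration only, not Busbar's actual code: the `Member` type, the `pick` function, and the backend names and weights in the usage note are all hypothetical, and the caller is assumed to supply a uniformly distributed `roll` (for example from an RNG).

```rust
/// One backend in a pool. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Member {
    name: &'static str,
    weight: u32,
}

/// Pick a pool member in proportion to its weight.
/// `roll` is any uniformly distributed number; it is reduced modulo
/// the total weight, then walked across the cumulative weights.
fn pick(pool: &[Member], roll: u64) -> Option<&Member> {
    let total: u64 = pool.iter().map(|m| u64::from(m.weight)).sum();
    if total == 0 {
        return None; // empty pool, or every weight is zero
    }
    let mut r = roll % total;
    for m in pool {
        let w = u64::from(m.weight);
        if r < w {
            return Some(m);
        }
        r -= w;
    }
    None // unreachable when total > 0, kept for exhaustiveness
}
```

With a pool of `on-prem` (weight 6), `z.ai` (weight 3), and `anthropic` (weight 1), rolls 0 through 5 land on `on-prem`, 6 through 8 on `z.ai`, and 9 on `anthropic`, so changing the split is a matter of editing weights rather than client code.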

My central idea is "protocols, not providers". I implement six protocols - Anthropic, OpenAI, Gemini, Bedrock, Responses, Cohere - losslessly. You define a provider in three lines of YAML, mainly specifying the protocol that provider speaks.

Your client speaks a protocol in to Busbar and Busbar speaks a protocol out to the provider.
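A rough sketch of what "protocol in, protocol out" can involve, using heavily simplified request shapes. This is not Busbar's implementation: the struct names, the `to_anthropic` function, and the `DEFAULT_MAX_TOKENS` fallback are assumptions made for illustration. It relies on two real dialect differences: an OpenAI-style chat request treats `max_tokens` as optional and carries the system prompt as a message, while an Anthropic-style request requires `max_tokens` and takes the system prompt as a top-level field.

```rust
/// Simplified OpenAI-style chat request (only the fields this sketch needs).
#[derive(Debug, Clone, PartialEq)]
struct OpenAiRequest {
    model: String,                   // on the way in, this would be the pool name
    messages: Vec<(String, String)>, // (role, content)
    max_tokens: Option<u32>,         // optional in this dialect
}

/// Simplified Anthropic-style messages request.
#[derive(Debug, Clone, PartialEq)]
struct AnthropicRequest {
    model: String,
    system: Option<String>,          // system prompt is a top-level field
    messages: Vec<(String, String)>,
    max_tokens: u32,                 // required in this dialect
}

/// Hypothetical fallback for callers that omitted `max_tokens`.
const DEFAULT_MAX_TOKENS: u32 = 4096;

/// Translate a client request into the upstream dialect.
fn to_anthropic(req: &OpenAiRequest, upstream_model: &str) -> AnthropicRequest {
    let mut system_parts: Vec<String> = Vec::new();
    let mut messages = Vec::new();
    for (role, content) in &req.messages {
        if role == "system" {
            // Hoist system messages out of the conversation.
            system_parts.push(content.clone());
        } else {
            messages.push((role.clone(), content.clone()));
        }
    }
    AnthropicRequest {
        model: upstream_model.to_string(),
        system: if system_parts.is_empty() {
            None
        } else {
            Some(system_parts.join("\n\n"))
        },
        messages,
        // Fill in the field the target dialect requires.
        max_tokens: req.max_tokens.unwrap_or(DEFAULT_MAX_TOKENS),
    }
}
```

A real translator would also have to map the response back and handle streaming; the sketch covers only the request direction.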

- Each protocol translates request and response, streamed and buffered, in both directions. Same-protocol calls pass through untouched; cross-protocol calls reconcile the awkwardness (a field one dialect requires and another makes optional).

- A circuit breaker that knows whose fault a failure is. It stops routing to a backend that's genuinely failing, but it won't penalize a model for a request that was simply too big (it retries on a larger-context model instead), and it won't blame a backend when the caller sent a bad request. A healthy model never gets pulled from rotation for something that wasn't its fault. These are all issues I have personally faced and wanted to fix once in Busbar rather than ten times in ten applications.

- Hand-rolled AWS implementations so I am not reliant on the AWS SDKs: SigV4 signing and a from-scratch AWS eventstream frame decoder for Bedrock.
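The fault-aware circuit breaker described in the list above could be sketched roughly as follows. This is an assumption-laden illustration, not Busbar's code: the `Fault` enum, the `classify` status mapping (treating 408, 429, and 5xx as the backend's fault), and the consecutive-failure threshold are all hypothetical choices, and a production breaker would also need a cooldown or half-open state.

```rust
/// Who is to blame for a failed upstream call. Hypothetical classification.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Fault {
    Backend,         // the backend is genuinely failing
    Caller,          // the client sent a bad request
    ContextTooLarge, // the request exceeds this model's context window
}

/// Map an upstream result to a fault, or `None` on success.
/// The status ranges are an assumption made for this sketch.
fn classify(status: u16, context_overflow: bool) -> Option<Fault> {
    if context_overflow {
        return Some(Fault::ContextTooLarge);
    }
    match status {
        200..=299 => None,
        408 | 429 | 500..=599 => Some(Fault::Backend),
        _ => Some(Fault::Caller),
    }
}

/// Opens after `threshold` consecutive backend-attributed failures.
struct Breaker {
    consecutive_failures: u32,
    threshold: u32,
}

impl Breaker {
    fn new(threshold: u32) -> Self {
        Breaker { consecutive_failures: 0, threshold }
    }

    /// True when the backend should be skipped.
    fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }

    /// Record one outcome; only backend faults count against the backend.
    fn record(&mut self, outcome: Option<Fault>) {
        match outcome {
            None => self.consecutive_failures = 0,
            Some(Fault::Backend) => self.consecutive_failures += 1,
            // Not the backend's fault: leave its health untouched.
            Some(Fault::Caller) | Some(Fault::ContextTooLarge) => {}
        }
    }
}
```

The design point is that `record` ignores caller errors and oversized requests entirely, so a run of bad client requests cannot trip the breaker; a router could separately use `Fault::ContextTooLarge` as the signal to retry on a larger-context model.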

It's at 1.0.0-rc.2: feature-complete and API-stable, with release-candidate validation underway before 1.0.0. I have been using it on my projects and it's solving my problems nicely.

Solo project, AGPL-3.0. The AGPL choice is open to discussion; I know it matters for a request-path component.

Feedback very welcome, particularly on where the translation might still be lossy in edge cases. Contribution and conversation desired!

Similar Projects

Developer Tools · ●● Solid

Turn your Google accounts into a free, load-balanced LLM API gateway

Multi-account rotation with cooldowns beats single-account rate limits.

Big Brain · Ship It

ariozgun

459d ago

Infrastructure · ●● Solid

LLM-Gateway – Zero-Trust LLM Gateway

Zero-trust networking via zrok beats LiteLLM when your GPUs sit behind NAT.

Big Brain · Solve My Problem

michaelquigley

712mo ago

Developer Tools · ● Mid

OpenGem – Free, self-healing load-balanced proxy for Google Gemini API

Reverse-engineers free Gemini API; smart quota rotation, but against Google's terms of service.

Bold Bet

ariozgun

203mo ago

Infrastructure · ●●● Banger

AI load balancer and API translator

Unified API gateway for Ollama + vLLM with real-time GPU telemetry and drain mode.

Big Brain · Solve My Problem · Slick

sheneman42

103mo ago

Infrastructure · ●● Solid

Bifrost: Fastest enterprise AI gateway

Bifrost combines an OpenAI-compatible front door with adaptive load balancing, semantic caching, automatic failover, cluster mode and a built-in web UI — you can spin it up with npx or Docker in seconds. The performance claims (sub-100µs overhead at 5k RPS, '50x faster than LiteLLM') and multi-provider routing are the project's selling points; I want to see independent benchmarks and deeper docs on guardrails/provider quirks before trusting it for critical workloads.

Wizardry · Solve My Problem · Slick

aanthonymax

103mo ago

Developer Tools · ● Mid

LLM Gateway – Simple API format converter for LLM providers

LiteLLM already does this with more providers, more features, and way more maturity.

Ship It

modinfo

202mo ago