GitHub Repository

A fast, native-protocol LLM gateway with weighted pool composition and correct billing-vs-client failure handling.

0 stars · Rust

Busbar – every LLM behind one URL, in a single Rust binary

by mattjackson86·Jun 5, 2026·1 point·0 comments

AI Analysis

●●● Banger · Solve My Problem · Slick

Mid-request failover reroutes streaming responses before your client sees a byte.

Strengths
  • Lossless protocol translation preserves cache_control, thinking blocks, and citations.
  • 7.4MB static binary with sub-15ms cold start beats Python sidecar deployments.
  • Weighted pool composition turns model selection into config, not code changes.
Weaknesses
  • LLM gateway category already has LiteLLM, Portkey, and Helicone with funding.
  • AGPL license may limit enterprise adoption compared to MIT or Apache alternatives.
Target Audience

Teams running production LLM applications with multiple vendor dependencies

Similar To

LiteLLM · Portkey · Helicone

Post Description

I have been working on several projects lately that involve AI endpoints (including some I run locally), and I found I needed an easy way to load-balance across them. Sometimes my on-prem setup could not handle the load, and I'd have to crank up my z.ai or Anthropic usage, depending on where my credits were at the time.

One thing led to another and I ended up writing Busbar: an LLM gateway, written in Rust (I have a thing for Rust lately). You point your existing OpenAI/Anthropic/Gemini SDK at it, change the model to a pool name, and that name now load-balances across the vendors. Your client code doesn't change and never learns it even happened.
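As a rough illustration of how a pool name could fan out to several vendors by weight, here is a minimal sketch. It is not Busbar's actual code: the `Member` type, its field names, the `pick` function, and the example weights are all hypothetical, and a real gateway would also take backend health into account.

```rust
/// One backend in a pool. Type and field names are hypothetical,
/// not Busbar's real configuration schema.
#[derive(Debug, Clone, PartialEq)]
pub struct Member {
    pub provider: &'static str,
    pub weight: u32,
}

/// Pick a member in proportion to its weight.
/// `roll` is any uniformly distributed number (for example from an RNG);
/// it is reduced modulo the total weight of the pool.
/// Returns None when the pool is empty or every weight is zero.
pub fn pick(pool: &[Member], roll: u64) -> Option<&Member> {
    let total: u64 = pool.iter().map(|m| u64::from(m.weight)).sum();
    if total == 0 {
        return None;
    }
    let mut point = roll % total;
    for member in pool {
        let w = u64::from(member.weight);
        if point < w {
            return Some(member);
        }
        point -= w;
    }
    None // not reached when total > 0
}

/// An assumed pool: mostly on-prem, with two hosted vendors as overflow.
pub fn example_pool() -> Vec<Member> {
    vec![
        Member { provider: "on-prem", weight: 6 },
        Member { provider: "z.ai", weight: 3 },
        Member { provider: "anthropic", weight: 1 },
    ]
}
```

With these assumed weights, roughly six requests in ten would land on the on-prem backend. Shifting traffic toward a hosted vendor becomes a matter of changing a number rather than touching client code.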

My central idea is "protocols, not providers". I implement six protocols - Anthropic, OpenAI, Gemini, Bedrock, Responses, Cohere - losslessly. You define a provider in three lines of YAML, mainly specifying the protocol that provider speaks.

Your client speaks a protocol in to Busbar and Busbar speaks a protocol out to the provider.
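To make "protocol in, protocol out" concrete, the sketch below translates a heavily simplified OpenAI-style chat request into an Anthropic-style one. The struct shapes, the `openai_to_anthropic` function, and the `FALLBACK_MAX_TOKENS` value are assumptions made for illustration, not Busbar's implementation. The two differences it reconciles are real ones: the Anthropic Messages API requires `max_tokens` and takes the system prompt as a top-level field, while OpenAI Chat Completions treats `max_tokens` as optional and carries the system prompt as a message.

```rust
/// Simplified OpenAI-style chat request (illustrative subset of fields).
#[derive(Debug, Clone, PartialEq)]
pub struct OpenAiRequest {
    pub model: String,
    pub messages: Vec<(String, String)>, // (role, content)
    pub max_tokens: Option<u32>,         // optional in this dialect
}

/// Simplified Anthropic-style messages request (illustrative subset).
#[derive(Debug, Clone, PartialEq)]
pub struct AnthropicRequest {
    pub model: String,
    pub system: Option<String>,          // top-level field, not a message
    pub messages: Vec<(String, String)>, // (role, content)
    pub max_tokens: u32,                 // required in this dialect
}

/// Assumed default used when the caller did not set a limit.
pub const FALLBACK_MAX_TOKENS: u32 = 4096;

/// Translate a request, filling in what the target dialect requires.
/// `target_model` is the backend model chosen by the gateway.
pub fn openai_to_anthropic(req: &OpenAiRequest, target_model: &str) -> AnthropicRequest {
    let mut system_parts: Vec<String> = Vec::new();
    let mut messages = Vec::new();
    for (role, content) in &req.messages {
        if role == "system" {
            // System messages move to the top-level `system` field.
            system_parts.push(content.clone());
        } else {
            messages.push((role.clone(), content.clone()));
        }
    }
    AnthropicRequest {
        model: target_model.to_string(),
        system: if system_parts.is_empty() {
            None
        } else {
            Some(system_parts.join("\n\n"))
        },
        messages,
        // A field one dialect makes optional and the other requires.
        max_tokens: req.max_tokens.unwrap_or(FALLBACK_MAX_TOKENS),
    }
}
```

A lossless translator has to cover far more than this (tool calls, streaming deltas, cache hints, and so on), but the shape is the same: each mismatch between dialects gets an explicit, deterministic rule.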

- Each protocol translates request and response, streamed and buffered, in both directions. Same-protocol calls pass through untouched; cross-protocol calls reconcile the awkwardness (a field one dialect requires and another makes optional).

- A circuit breaker that knows whose fault a failure is. It stops routing to a backend that's genuinely failing, but it won't penalize a model for a request that was simply too big (it retries on a larger-context model instead), and it won't blame a backend when the caller sent a bad request. A healthy model never gets pulled from rotation for something that wasn't its fault. These are all issues I have personally faced, and I wanted to fix them once in Busbar rather than ten times across ten applications.

- Hand-rolled AWS implementations, so I am not reliant on the AWS SDKs: SigV4 and a from-scratch AWS eventstream frame decoder for Bedrock.
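The fault-aware circuit breaker described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not Busbar's actual design: the `Fault`, `Action`, and `Breaker` names, the consecutive-failure threshold, and the idea that failures arrive already classified are all hypothetical.

```rust
/// Who is to blame for a failed request (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Fault {
    Backend,         // the backend itself is failing (errors, timeouts)
    ContextTooLarge, // the request exceeded the model's context window
    Client,          // the caller sent a bad request
}

/// What the gateway does next.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    FailOver,           // try another backend in the pool
    RetryLargerContext, // reroute to a model with a bigger window
    ReturnToCaller,     // surface the error to the client
}

/// A per-backend breaker that opens after `threshold` consecutive
/// backend-attributed failures.
pub struct Breaker {
    consecutive_failures: u32,
    threshold: u32,
}

impl Breaker {
    pub fn new(threshold: u32) -> Self {
        Breaker { consecutive_failures: 0, threshold }
    }

    /// True when the backend should be skipped by the router.
    pub fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }

    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Only failures that are the backend's fault count against it.
    pub fn record_failure(&mut self, fault: Fault) -> Action {
        match fault {
            Fault::Backend => {
                self.consecutive_failures += 1;
                Action::FailOver
            }
            Fault::ContextTooLarge => Action::RetryLargerContext,
            Fault::Client => Action::ReturnToCaller,
        }
    }
}
```

The key design choice is that `record_failure` only increments the counter for `Fault::Backend`. Oversized or malformed requests still produce an action, but they leave the backend's health state untouched, so a healthy model stays in rotation.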

It's at 1.0.0-rc.2: feature-complete and API-stable, with release-candidate validation underway before 1.0.0. I have been using it on my projects and it's solving my problems nicely.

Solo project, AGPL-3.0. The AGPL choice is open to discussion; I know it matters for a request-path component.

Feedback very welcome, particularly on where the translation might still be lossy in edge cases. Contribution and conversation desired!

Similar Projects

Infrastructure · ●● Solid

LLM-Gateway – Zero-Trust LLM Gateway

Zero-trust networking via zrok beats LiteLLM when your GPUs sit behind NAT.

Big Brain · Solve My Problem
michaelquigley
712mo ago
Infrastructure · ●● Solid

Bifrost: Fastest enterprise AI gateway

Bifrost combines an OpenAI-compatible front door with adaptive load balancing, semantic caching, automatic failover, cluster mode and a built-in web UI — you can spin it up with npx or Docker in seconds. The performance claims (sub-100µs overhead at 5k RPS, '50x faster than LiteLLM') and multi-provider routing are the project's selling points; I want to see independent benchmarks and deeper docs on guardrails/provider quirks before trusting it for critical workloads.

Wizardry · Solve My Problem · Slick
aanthonymax
103mo ago