GitHub Repository

Tiny, stateless Go router that dispatches OpenAI-compatible requests to single-model vLLM and sglang backends with zero external dependencies

11 stars · Nix

LLMhop – A tiny, stateless router for LLMs with a NixOS module

Author: mlenz

by mlenz · Jun 4, 2026 · 1 point · 0 comments


AI Analysis

●● Solid · Niche Gem · Cozy

Pure-Go, zero-dependency router for vLLM and sglang: a clean solution to a real infra pain.

Strengths

•Pure Go with zero external dependencies — single binary, no CGO, no third-party packages.
•NixOS module with Quadlet/Podman integration shows thoughtful deployment engineering.
•Stateless design means it sits safely behind any load balancer without coordination.

Weaknesses

•Niche audience — only matters if you're running multiple single-model inference servers.
•A low GitHub star count (11 at the time of listing) suggests limited real-world testing so far.

Post Description

LLMhop is a tiny stateless proxy for LLM inference servers. It tackles an issue I faced when trying to serve more than one local LLM at once, which vLLM does not support natively. The LLMhop binary inspects the model field of each request and routes it to the matching backend service, optionally handling authentication. The project also includes a NixOS module that runs llama.cpp, vLLM, and sglang via Quadlet/Podman and auto-registers them with the proxy.
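
To make the routing idea concrete, here is a minimal sketch of model-field dispatch using only the Go standard library. It is not LLMhop's actual code: the `Router` type, `NewRouter`, `pickBackend`, the model names, and the backend URLs are all hypothetical, and authentication handling is left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"io"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// Router is a hypothetical stateless dispatcher: it only holds a fixed
// model-name -> backend-URL table and keeps no per-request state.
type Router struct {
	backends map[string]*url.URL
}

// NewRouter parses a table of model names to backend base URLs.
func NewRouter(table map[string]string) (*Router, error) {
	r := &Router{backends: make(map[string]*url.URL, len(table))}
	for model, raw := range table {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, err
		}
		r.backends[model] = u
	}
	return r, nil
}

// pickBackend reads the "model" field of an OpenAI-style JSON body
// and returns the backend registered for that model.
func (r *Router) pickBackend(body []byte) (*url.URL, error) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	if req.Model == "" {
		return nil, errors.New("request has no model field")
	}
	u, ok := r.backends[req.Model]
	if !ok {
		return nil, errors.New("unknown model: " + req.Model)
	}
	return u, nil
}

// ServeHTTP buffers the body, chooses a backend, restores the body,
// and forwards the request through a standard-library reverse proxy.
func (r *Router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	body, err := io.ReadAll(req.Body)
	if err != nil {
		http.Error(w, "cannot read request body", http.StatusBadRequest)
		return
	}
	target, err := r.pickBackend(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	req.Body = io.NopCloser(bytes.NewReader(body))
	req.ContentLength = int64(len(body))
	httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, req)
}
```

Since `Router` implements `http.Handler`, it could be served with something like `http.ListenAndServe(":8080", router)`. Because the table is read-only after construction, several copies of such a process could run side by side without coordinating, which is the property the stateless design relies on.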