GitHub Repository

An open-source API Gateway & background daemon designed to queue inference surges and scale cloud GPUs down to zero when idle.

1 star · Python

ZeroGate – API gateway to scale cloud GPUs to zero when idle

by ngarner·Jun 26, 2026·2 points·0 comments

AI Analysis

Mid · Ship It · Bold Bet

GPU autoscaling is solved by Kubernetes; this adds complexity without clear novelty.

Strengths
  • Mock mode tests orchestration logic without needing any actual GPU hardware accounts.
  • Integrated billing ledger tracks token-level utilization metrics alongside real-time infrastructure costs.
  • Docker Compose setup enables local evaluation of the full infrastructure stack immediately.
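
To make the mock-mode and scale-to-zero ideas concrete, here is a minimal sketch of how a gateway might queue requests and shut a GPU down after an idle period. It is not taken from the ZeroGate code: `MockGpuProvider`, `Gateway`, `idle_timeout`, and every method name are hypothetical, and timestamps are passed in explicitly so the logic can be exercised without real hardware or real waiting.

```python
from collections import deque
from dataclasses import dataclass, field


class MockGpuProvider:
    """Hypothetical stand-in for a cloud GPU API; it only records state changes."""

    def __init__(self):
        self.running = False
        self.events = []

    def start(self):
        self.running = True
        self.events.append("start")

    def stop(self):
        self.running = False
        self.events.append("stop")


@dataclass
class Gateway:
    """Illustrative gateway: queues requests, wakes the GPU, stops it when idle."""

    provider: MockGpuProvider
    idle_timeout: float = 300.0  # assumed default, in seconds
    queue: deque = field(default_factory=deque)
    last_activity: float = 0.0

    def submit(self, request, now):
        """Queue a request and start the GPU if it is currently scaled to zero."""
        self.queue.append(request)
        self.last_activity = now
        if not self.provider.running:
            self.provider.start()

    def drain(self, now):
        """Hand back everything queued, in arrival order."""
        handled = []
        while self.queue:
            handled.append(self.queue.popleft())
        if handled:
            self.last_activity = now
        return handled

    def tick(self, now):
        """Daemon step: stop the GPU once the queue is empty and the timeout has passed."""
        idle_for = now - self.last_activity
        if self.provider.running and not self.queue and idle_for >= self.idle_timeout:
            self.provider.stop()
```

In this sketch a background loop would call `tick` periodically. A real provider would also need to handle cold-start latency and failed start or stop calls, which the mock ignores.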
Weaknesses
  • KEDA on Kubernetes already handles scale-to-zero autoscaling for containerized workloads.
  • Buzzword-heavy README obscures actual technical differentiation from existing established infrastructure tools.
Target Audience

ML engineers managing multi-tenant inference pipelines

Similar To

Kubernetes KEDA · Skypilot · Modal

Similar Projects

Infrastructure · ●● Solid

LLM-Gateway – Zero-Trust LLM Gateway

Zero-trust networking via zrok beats LiteLLM when your GPUs sit behind NAT.

Big Brain · Solve My Problem
michaelquigley
713mo ago