GreyFox – Free self-hosted AI proxy, token quotas, and local cache

by SkilfulFox · Jun 21, 2026 · 3 points · 0 comments

AI Analysis

● Mid · Solve My Problem

Yet another AI proxy in a space that LiteLLM and Helicone already dominate.

Strengths

• SQLite storage means no external database dependencies to manage
• Mock mode enables zero-cost onboarding and demos without live keys
• Exact response cache reduces redundant token spend on repeated queries
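
To make the exact-cache and SQLite points concrete, here is a minimal sketch of an exact-match response cache backed by SQLite. It is not GreyFox's actual implementation: `ExactCache`, `cache_key`, `complete`, and the `call_upstream` callback are hypothetical names chosen for illustration, and the request shape (a model name plus a list of messages) is an assumption.

```python
import hashlib
import json
import sqlite3
from typing import Callable, Optional


def cache_key(model: str, messages: list) -> str:
    """Hash the canonical request so only identical requests share a key."""
    canonical = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExactCache:
    """Hypothetical exact-match cache stored in a single SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )

    def get(self, key: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT response FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def put(self, key: str, response: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, response) VALUES (?, ?)",
            (key, response),
        )
        self.db.commit()


def complete(
    cache: ExactCache,
    model: str,
    messages: list,
    call_upstream: Callable[[str, list], str],
) -> str:
    """Return a cached response if one exists, otherwise call the provider once."""
    key = cache_key(model, messages)
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: no upstream tokens spent
    response = call_upstream(model, messages)
    cache.put(key, response)
    return response
```

Passing a stub function as `call_upstream` gives a rough stand-in for a mock mode: the proxy logic can be exercised without live provider keys. Because the key is a hash of the whole request, any change to the model or messages is a cache miss, which is the trade-off between an exact cache and a semantic one.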

Weaknesses

• 5-user limit makes it unusable for most real teams
• No differentiation from established players like LiteLLM or Portkey

Similar Projects

Developer Tools · ●● Solid

Aisbf, a self-hostable OpenAI-compatible AI proxy/router

LiteLLM and OpenRouter already solve multi-provider routing better and have production users.

Big Brain · Niche Gem

nextime

241mo ago

Infrastructure · ●● Solid

UnifyRoute – Self-hosted OpenAI-compatible LLM gateway with failover

Drop-in OpenAI API gateway with failover. LiteLLM does this too, but this one has a dashboard.

Solve My Problem · Slick

unifyroute

113mo ago

Developer Tools · ●● Solid

Docker-whisper: Self-hosted Whisper speech-to-text server (OpenAI API)

One-command Docker deploy from hwdsl2, who maintains trusted WireGuard and OpenVPN images.

Cozy · Solve My Problem

hwdsl2

612mo ago

Developer Tools · ●● Solid

TokenMeter – Open-source observability layer for LLM token costs

Proxying every LLM call to log tokens is the right kind of blunt instrument: you get per-developer, per-model cost telemetry immediately. Smart routing and the built-in semantic cache (claims 45–80% savings) are the most useful ideas here, but the default SQLite backend and admin/admin credentials scream MVP rather than production-ready.

Solve My Problem · Niche Gem

Mohit8880

134mo ago

Infrastructure · ●● Solid

Forja – Remote Docker Builders on AWS

Ephemeral EC2 builders with mTLS beat GitHub Actions for cost control.

Solve My Problem · Ship It

noqcks

123mo ago

Infrastructure · ● Mid

Sentinel – Go LLM Proxy with 13ms Semantic Cache and PII Scrubbing

Multi-model LLM router with semantic cache, but caching and fallback already exist elsewhere (Anthropic, LangSmith, Unify).

Slick · Crowd Pleaser

ChipShotz

113mo ago