Grunden – Frontier AI inference hosted in Sweden, OpenAI-compatible
Yet another OpenAI wrapper, but hosted in Sweden for EU compliance.
Zero-friction local server that loads any GGUF model behind an OpenAI-compatible REST API. No venv, no cloud. Wraps llama.cpp with a thin FastAPI surface supporting streaming and function calling.
Ollama and llama.cpp server already do this with more maturity and model support.
Developers testing local LLMs
Ollama · llama.cpp · LM Studio
Existing OpenAI-compatible servers often require Docker, complex configuration files, or GPU support.
The gap between "I have a .gguf file" and "I have a working API endpoint" is wider than it should be.
A simple CLI tool to serve GGUF models as an endpoint: gguf-serve
To cut this short, we asked Neo to build gguf-serve.
Point it at any .gguf file, run the server, and immediately get OpenAI-compatible endpoints that work with any client library or tool that speaks the OpenAI API format.
Yet another OpenAI wrapper, but hosted in Sweden for EU compliance.
Yet another OpenAI-compatible gateway when LiteLLM and OpenRouter already exist.
DeepSeek and Qwen access without Chinese phone number — Together AI for China models.
Yet another model runner when Ollama already dominates this space.
Full LLM pipeline in one window when LM Studio only does inference.
Another async Python AI agent framework in a saturated category with no novel differentiation.