Run AI chat, image gen, vision, and voice offline on your Mac

Name: Run AI chat, image gen, vision, and voice offline on your Mac
Availability: InStock
Author: olie_h

by olie_h·Jun 29, 2026·10 points·3 comments

Visit Project View on HN

AI Analysis

●●●BangerZero to OneSolve My ProblemDark Horse

Local OpenAI endpoint at 127.0.0.1:7878 beats Ollama with full studio and MCP connectors.

Strengths

•OpenAI-compatible local endpoint means zero code changes to existing AI clients and workflows
•Comprehensive suite: chat, image gen, vision, voice, RAG, and MCP connectors in one package
•Actually ships on App Store and Google Play with real benchmarks, not just a README

Weaknesses

•Local LLM space getting crowded with Ollama, LM Studio, and Jan already established
•Desktop and mobile apps may have feature parity gaps as the ecosystem grows

Post Description

Your Mac can run AI that holds its own against cloud models for the everyday stuff: chatting, making images, reading documents, transcribing voice. The hardware got there a while ago. The software to actually use it locally mostly didn't, so I built Off Grid.

Download a model and it all runs on your machine. Ask it something on a flight with no wifi. Summarize a confidential document that never leaves your laptop. Run a hundred image generations in a loop and pay nothing, because it's your own GPU doing the work. Swap your paid dictation app for local Whisper. Talk through a coding problem at 1am with no meter running.

Developers: it exposes a local OpenAI-compatible endpoint at 127.0.0.1:7878/v1. Point Claude Code or any OpenAI client at it and run loops against local models at zero token cost. You can run it headless as just the gateway, or open the full studio with chat, RAG over your own docs, live artifacts, and MCP connectors.

Under the hood it's llama.cpp for text, stable-diffusion.cpp for images (Metal, ships SDXL-Lightning and Z-Image-Turbo), Whisper for voice, Kokoro for speech. Encrypted local DB, LanceDB for vectors, Electron and React.

Open core is AGPL. Signed macOS build on Releases, Windows coming.

Inference and your data stay on the machine; once a model's downloaded the open core makes no network calls.