GitHub Repository

Native macOS menu bar app for realtime dictation with optional LLM polishing. Connects to any OpenAI Realtime-compatible backend — fully local on Apple Silicon with voxmlx + mlx-lm.

38 stars · Swift

Localvoxtral – Local real-time dictation on macOS with streaming STT

by T0mSIlver · Feb 24, 2026 · 1 point · 2 comments

AI Analysis

●● Solid · Wizardry · Ship It

Streaming speech-to-text on-device beats Whisper's wait-for-silence UX pattern.

Strengths
  • Voxtral Realtime architecture streams words mid-utterance, not post-speech like Whisper clones.
  • Fully local inference path: a voxmlx fork with a WebSocket server and memory optimizations, running on an M1 Pro.
  • Native Swift menu bar app with global shortcut, auto-paste, and microphone selection—feels polished.
Weaknesses
  • Currently macOS-only; the lack of Windows or Linux support limits the audience.
  • Depends on a relatively early Mistral model; real-world accuracy compared with commercial Whisper-plus-post-processing tools is unclear.
Target Audience

macOS users needing low-latency dictation; developers building voice interfaces

Similar To

Whisper (OpenAI) · Superwhisper · Monologue

Post Description

I built a native macOS menu bar app for real-time dictation that can run fully on-device.

Most dictation tools, even local ones, use Whisper or similar offline models: you record, then wait for the transcript. Localvoxtral uses Mistral's Voxtral Realtime, one of the first open-source speech models with a natively streaming architecture. Words appear as you speak, not after you stop. It feels closer to someone typing along as you talk.
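To make the difference concrete, here is a minimal sketch of how a client might fold streamed transcript events into the text it displays. The event names follow the OpenAI Realtime API's transcription events; whether localvoxtral's backends emit exactly these events is an assumption, and `apply_event` and `replay` are hypothetical helpers, not code from the app.

```python
DELTA = "conversation.item.input_audio_transcription.delta"
COMPLETED = "conversation.item.input_audio_transcription.completed"


def apply_event(text_so_far: str, event: dict) -> str:
    """Return the text to display after one server event.

    Assumed event shapes (not taken from the app's source):
      delta     -> {"type": DELTA, "delta": "<new fragment>"}
      completed -> {"type": COMPLETED, "transcript": "<final text>"}
    """
    kind = event.get("type")
    if kind == DELTA:
        # Streaming case: append the fragment as soon as it arrives.
        return text_so_far + event.get("delta", "")
    if kind == COMPLETED:
        # The final transcript replaces the partial text built from deltas.
        return event.get("transcript", text_so_far)
    # Ignore anything else (session updates, errors, and so on).
    return text_so_far


def replay(events: list[dict]) -> list[str]:
    """Return the displayed text after each event, for inspection."""
    shown, text = [], ""
    for event in events:
        text = apply_event(text, event)
        shown.append(text)
    return shown
```

In this picture, a record-then-transcribe tool behaves as if only the `completed` event existed: nothing can be shown until the utterance ends. A streaming model gives the client something to display after every delta.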

Press a shortcut, speak, and text gets typed directly into whatever app you're in. No cloud, no subscription, no data leaving your machine.

Two backend options:

  • voxmlx on Apple Silicon: I forked voxmlx to add a WebSocket server and memory optimizations. Runs a 4-bit quantized model on an M1 Pro. Audio and inference stay fully on-device.
  • vLLM on NVIDIA GPU: tested on an RTX 3090, noticeably faster.
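As a rough sketch of what the client side of an OpenAI Realtime-compatible connection involves, the snippet below packs microphone samples into `input_audio_buffer.append` messages, the audio-upload event defined by the OpenAI Realtime API. The 16-bit little-endian PCM encoding and the helper name `audio_append_message` are assumptions for illustration; the app's actual wire format and sample rate are not stated in this post.

```python
import base64
import json
import struct


def audio_append_message(samples: list[float]) -> str:
    """Encode float samples in [-1.0, 1.0] as one JSON text frame.

    Assumption: the server accepts base64-encoded 16-bit little-endian
    mono PCM, as the OpenAI Realtime API does for its "pcm16" format.
    """
    # Clamp to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    raw = struct.pack(f"<{len(pcm)}h", *pcm)
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(raw).decode("ascii"),
    })
```

Each returned string would be sent as one WebSocket text frame. Because the message format is the same whichever server is listening, the backend can be swapped without changing how the client encodes audio.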

The app is native Swift (~97%), lives in the menu bar, and stays out of your way. Configurable shortcut, mic selection, auto-paste.

GitHub: https://github.com/T0mSIlver/localvoxtral

Pre-built DMG available in Releases

Similar Projects

AI/ML · ●● Solid

MoonshineFlow

Local Moonshine transcription streams text to any app, no audio leaves your Mac.

Cozy · Solve My Problem
jrm-veris
60 · 2mo ago