GitHub Repository

FreeFlow - seamless speech to text in any app. Press a hotkey, dictate naturally, polished text appears in any app.

65 stars · Swift

FreeFlow – Open-Source Wispr Flow

by _mrinalwadhwa_ · Mar 17, 2026 · 4 points · 4 comments

AI Analysis

●●● Banger · Solve My Problem · Slick · Dark Horse

Racing WebSocket connections deliver sub-second dictation latency at zero per-seat cost.

Strengths
  • Two racing WebSocket connections with an HTTP batch fallback; in the author's benchmarks, two thirds of dictations finish in under 0.6 seconds.
  • A skip heuristic bypasses the polish step for clean transcripts, which covers about 40% of dictations.
  • One private server deployment serves entire team with no per-seat fees.
Weaknesses
  • Requires deploying your own server infrastructure, adding operational overhead.
  • macOS-only for now; Windows and Linux users can't use the client app.
Target Audience

Developers and teams wanting private, self-hosted voice dictation

Similar To

Wispr Flow · Superwhisper · Monologue

Post Description

Hi HN!

Voice is fast becoming my primary interface to computers and AI. I built FreeFlow because I wanted a Wispr Flow-like experience for our entire team, but customizable and private.

Press a hotkey, dictate naturally, polished text appears in any app. Ramble, use filler words, correct yourself mid-sentence. FreeFlow turns messy speech into clean writing and injects it wherever your cursor is: your messaging app, your editor, your coding agent, the terminal, email, anything.

Demo (sound on): https://github.com/build-trust/freeflow#demo-sound-on-

It's really fast. The injection feels instantaneous. In my benchmarks two thirds of dictations finish in under 0.6 seconds. To get that speed, the app streams audio to your private server over a persistent WebSocket while you speak, and a realtime speech-to-text model transcribes incrementally, so by the time you release the key the transcript is mostly done. Two independent WebSocket connections race each other, and if both fail, an HTTP batch fallback catches it. The transcript goes through a post-processing step that removes filler words and fixes grammar. About 40% of dictations are clean enough to skip this step entirely. When post-processing is needed, a fast model handles it in about 0.4 seconds.
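To make the racing and skip ideas concrete, here is a minimal Python sketch. It is not FreeFlow's code (the app itself is written in Swift), and every name in it is hypothetical: `fake_stream` stands in for a streaming WebSocket transcription, `race_with_fallback` takes the first connection that succeeds and calls an HTTP-style batch fallback only if all of them fail, and `needs_polish` is an assumed heuristic based on filler words and terminal punctuation. The post does not say how the real skip decision is made.

```python
import asyncio

# Hypothetical filler list; the real heuristic is not described in the post.
FILLERS = {"um", "uh", "er", "hmm"}


async def fake_stream(delay: float, text: str, fail: bool = False) -> str:
    """Stand-in for one streaming transcription connection (assumption)."""
    await asyncio.sleep(delay)
    if fail:
        raise ConnectionError("stream dropped")
    return text


async def race_with_fallback(attempts, fallback):
    """Return the first successful attempt; call fallback() if every attempt fails."""
    pending = {asyncio.ensure_future(a) for a in attempts}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()  # first success wins the race
                # otherwise this connection failed; keep waiting on the rest
    finally:
        for task in pending:
            task.cancel()  # stop the losing connection
    return await fallback()


def needs_polish(transcript: str) -> bool:
    """Assumed heuristic: polish if filler words appear or the text lacks end punctuation."""
    words = {w.strip(".,!?").lower() for w in transcript.split()}
    has_filler = bool(words & FILLERS)
    ends_cleanly = transcript.rstrip().endswith((".", "!", "?"))
    return has_filler or not ends_cleanly


async def http_batch_fallback() -> str:
    """Stand-in for the HTTP batch request used when both streams fail (assumption)."""
    return "transcript from batch fallback"


async def dictate() -> str:
    transcript = await race_with_fallback(
        [
            fake_stream(0.05, "We should ship on Friday."),
            fake_stream(0.01, "We should ship on Friday."),
        ],
        http_batch_fallback,
    )
    if needs_polish(transcript):
        transcript = transcript.strip() + "."  # placeholder for the polish model call
    return transcript


if __name__ == "__main__":
    print(asyncio.run(dictate()))
```

The design point is that a failed connection costs nothing as long as the other one finishes, and the fallback adds latency only in the rare case where both fail. A cheap local check like `needs_polish` is what lets clean transcripts skip the extra model round trip.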

It's designed to be taken apart and reassembled. You can swap the speech model, rewrite the prompts, add new languages, or fork the entire experience to fit how your team works. I'm hoping people will morph it into other products.

The FreeFlow service is open source. You can self-host it, but running a low-latency streaming dictation service for a team is real infrastructure work: persistent WebSocket connections, streaming routes to speech models, failover, rate limits. At a company with fifty or five hundred people, keeping that reliable is a job in itself. FreeFlow uses Autonomy to make this easy. On first launch, the macOS app deploys the service to a private server. Two minutes, no infrastructure knowledge needed. You can then invite your team. One server handles everyone, no per-seat fees. It sustains thousands of simultaneous streaming connections. In a stress test, 50 people dictating at the same time got sub-second latency with zero failures.

brew install build-trust/freeflow/freeflow

It's macOS only for now, but I plan to build for other operating systems. The two most useful contributions right now are mic compatibility data (every mic behaves differently) and prompts that improve polish quality for a specific language.

Try it, tell me how it works with your mic and your apps. What's fast, what's slow, what's broken.

GitHub: https://github.com/build-trust/freeflow
