TTSLab – Text-to-speech that runs in the browser via WebGPU

by MbBrainz · Feb 24, 2026 · 3 points · 0 comments

AI Analysis

Whisper + Kokoro entirely in-browser via WebGPU, with no API keys and no network requests once the models are cached.

Strengths
  • Voice Agent chains STT → LLM → TTS locally on GPU with zero network calls.
  • WebGPU + WASM stack avoids the network round trips and privacy concerns of cloud TTS APIs.
  • Live model comparison and hardware benchmarking are built in; model weights are cached locally.
Weaknesses
  • Model roster still small (Kokoro, SpeechT5, Piper, Whisper) vs. cloud services' breadth.
  • WebGPU support is experimental; WASM fallback quality may not match cloud alternatives.
Target Audience

Developers, researchers, and product teams evaluating TTS/STT models; privacy-conscious users.

Similar To

OpenAI Whisper (browser demos) · Piper TTS · Google Colab speech notebooks

Post Description

I built TTSLab — a free, open-source tool for running text-to-speech and speech-to-text models directly in the browser using WebGPU and WASM. No API keys, no backend, no data leaves your machine.
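A minimal sketch of how a page might choose between the two runtimes mentioned above. This is an illustration, not TTSLab's actual code: `pickBackend`, `Backend`, and `GpuCapableNavigator` are hypothetical names, and the only real browser API assumed is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable GPU is available.

```typescript
// Structural type for the one part of `navigator` this sketch touches.
// (Hypothetical helper type; real projects may use the official WebGPU typings.)
interface GpuCapableNavigator {
  gpu?: { requestAdapter(): Promise<object | null> };
}

type Backend = "webgpu" | "wasm";

// Prefer WebGPU when an adapter is actually available; otherwise fall back to WASM.
async function pickBackend(nav: GpuCapableNavigator): Promise<Backend> {
  if (!nav.gpu) return "wasm"; // the browser does not expose WebGPU at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no usable GPU adapter
  } catch {
    return "wasm"; // treat any failure during detection as "no WebGPU"
  }
}
```

In a browser this could be called as `pickBackend(navigator as GpuCapableNavigator)` before any model is loaded.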

When you open the site, you'll hear it immediately — the landing page auto-generates speech from three different sentences right in your browser, no setup required.

You can then try any model yourself: type text, hit generate, hear it instantly. Models download once and get cached locally.
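The "download once, then reuse" behaviour can be sketched as a read-through cache. How TTSLab actually stores weights is not stated in the post, so everything here is an assumption: `WeightStore`, `Downloader`, `loadWeights`, and `MemoryStore` are hypothetical names, and the in-memory store stands in for whatever browser storage (for example the Cache API or IndexedDB) a real page would use.

```typescript
// Hypothetical storage interface; browser storage would sit behind it in practice.
interface WeightStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical download function, e.g. a thin wrapper around fetch().
type Downloader = (url: string) => Promise<Uint8Array>;

// Return cached weights if present; otherwise download once and store them.
async function loadWeights(
  url: string,
  store: WeightStore,
  download: Downloader,
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached) return cached; // cache hit: no network involved
  const bytes = await download(url); // first use: one download
  await store.put(url, bytes);
  return bytes;
}

// In-memory stand-in so the sketch runs anywhere; it does not persist across reloads.
class MemoryStore implements WeightStore {
  private readonly entries = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | undefined> {
    return this.entries.get(key);
  }
  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(key, bytes);
  }
}
```

Calling `loadWeights` twice with the same URL and store triggers the downloader only on the first call.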

The most experimental feature: a fully in-browser Voice Agent. It chains speech-to-text → LLM → text-to-speech, all running locally on your GPU via WebGPU. You can have a spoken conversation with an AI without a single network request.
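The chain described above can be sketched as three awaited stages, each feeding the next. This is a simplified illustration under stated assumptions, not the project's implementation: `VoiceAgentStages`, `runTurn`, and the stage signatures are hypothetical, and the actual models' inputs and outputs may differ.

```typescript
// Hypothetical stage signatures for one conversational turn.
interface VoiceAgentStages {
  transcribe(audio: Float32Array): Promise<string>; // speech-to-text
  reply(prompt: string): Promise<string>; // language model
  synthesize(text: string): Promise<Float32Array>; // text-to-speech
}

interface TurnResult {
  heard: string; // what the STT stage understood
  answer: string; // what the LLM stage replied
  speech: Float32Array; // audio samples produced by the TTS stage
}

// Run one turn: microphone audio in, synthesized reply audio out.
async function runTurn(
  stages: VoiceAgentStages,
  micAudio: Float32Array,
): Promise<TurnResult> {
  const heard = await stages.transcribe(micAudio);
  const answer = await stages.reply(heard);
  const speech = await stages.synthesize(answer);
  return { heard, answer, speech };
}
```

Because the stages are passed in, each one can be swapped (for example Whisper Tiny for Whisper Base) without touching the chaining logic.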

Currently supported models:
- TTS: Kokoro 82M, SpeechT5, Piper (VITS)
- STT: Whisper Tiny, Whisper Base

Other features:
- Side-by-side model comparison
- Speed benchmarking on your hardware
- Streaming generation for supported models
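One common way to express TTS speed on a given machine is the real-time factor: synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The post does not say which metric TTSLab reports, so the sketch below is an assumption; `realTimeFactor`, `benchmark`, and `SynthResult` are hypothetical names, and the clock is injectable so the measurement can be checked deterministically.

```typescript
// Hypothetical shape of a synthesis result: raw samples plus their sample rate.
interface SynthResult {
  samples: Float32Array;
  sampleRate: number; // samples per second
}

interface BenchmarkResult {
  elapsedMs: number;
  audioSeconds: number;
  realTimeFactor: number;
}

// Real-time factor = processing time / audio duration.
function realTimeFactor(elapsedMs: number, sampleCount: number, sampleRate: number): number {
  const audioSeconds = sampleCount / sampleRate;
  return elapsedMs / 1000 / audioSeconds;
}

// Time a single synthesis call. `now` defaults to Date.now; in a browser,
// () => performance.now() would give finer resolution.
async function benchmark(
  synthesize: (text: string) => Promise<SynthResult>,
  text: string,
  now: () => number = Date.now,
): Promise<BenchmarkResult> {
  const start = now();
  const { samples, sampleRate } = await synthesize(text);
  const elapsedMs = now() - start;
  return {
    elapsedMs,
    audioSeconds: samples.length / sampleRate,
    realTimeFactor: realTimeFactor(elapsedMs, samples.length, sampleRate),
  };
}
```

A single run is noisy; averaging several runs after a warm-up call would give a steadier figure.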

Source: https://github.com/MbBrainz/ttslab (MIT)

Feedback I'd especially like:
1. How does performance feel on your hardware?
2. What models should I add next?
3. Did the Voice Agent work for you? That's the most experimental part.

Built on top of ONNX Runtime Web (https://onnxruntime.ai) and Transformers.js — huge thanks to those communities for making in-browser ML inference possible.
