KokoClone – Zero-shot voice cloning using Kokoro TTS
Kokoro voice cloning with multilingual support, but voice cloning itself is crowded.
Another voice cloning platform when ElevenLabs already dominates this space.
Content creators, developers building voice features
ElevenLabs · PlayHT · Murf
Shrinks the usual TTS bloat into a 16 MB Electron-alternative wrapper while still letting you clone voices from a short sample and 'design' voices from text prompts. It handles model downloads for you and supports batch exports and macOS auto-updates, all smart product trade-offs. Caveat: the app binary is tiny, but the underlying TTS models are downloaded on demand, so expect large model pulls behind the scenes.
This repo bundles a complete local audio loop: the client captures audio, the backend transcribes it with Parakeet, queries a quantized Mistral LLM via Ollama, then renders speech with Kokoro or Qwen3-TTS for cloning. It reports ~1s round-trip on an RTX 5070. It’s a practical, take-home demo for running privacy-first voice agents, though it’s still a demo: it requires specific tooling (Ollama, GPU headroom), has obvious TODOs (VAD, better warmup for cloning), and isn’t reinventing the architecture.
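To make the shape of that loop concrete, here is a minimal sketch of one turn through three stages with per-stage timing. Everything in it is an assumption for illustration: `transcribe`, `generate_reply`, `synthesize`, and `run_turn` are hypothetical names, and the stage bodies are trivial stand-ins rather than calls into Parakeet, Ollama, Kokoro, or Qwen3-TTS. It is not taken from the repo.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins. A real setup would call the ASR, LLM, and TTS
# backends here; none of their actual APIs are used in this sketch.
def transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")      # pretend the audio is already text

def generate_reply(prompt: str) -> str:
    return f"You said: {prompt}"      # pretend an LLM produced this

def synthesize(text: str) -> bytes:
    return text.encode("utf-8")       # pretend these bytes are speech

STAGES: List[Tuple[str, Callable]] = [
    ("asr", transcribe),
    ("llm", generate_reply),
    ("tts", synthesize),
]

@dataclass
class TurnResult:
    audio: bytes
    timings: Dict[str, float] = field(default_factory=dict)

def run_turn(audio_in: bytes, stages=STAGES) -> TurnResult:
    """Feed one utterance through each stage in order, timing each one."""
    timings: Dict[str, float] = {}
    value = audio_in
    start = time.perf_counter()
    for name, stage in stages:
        t0 = time.perf_counter()
        value = stage(value)
        timings[name] = time.perf_counter() - t0
    timings["total"] = time.perf_counter() - start
    return TurnResult(audio=value, timings=timings)

if __name__ == "__main__":
    result = run_turn(b"hello")
    for name, seconds in result.timings.items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

Timing each stage separately is what lets a round-trip figure like the reported ~1s be broken down into transcription, generation, and synthesis cost.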
Barebones TTS wrapper with no unique features over browser native APIs.
Yet another AI text-to-speech wrapper; ElevenLabs and Google Cloud TTS already exist.
The repo actually solves the messy plumbing of live voice agents: modular ASR→LLM→TTS adapters plus an optional PersonaPlex speech-to-speech path, per-agent env overrides, and a Playwright-driven Jitsi bot for room joining. It's a useful MVP for anyone prototyping AI co-hosts, though mixing backends is still manual and PersonaPlex demands extra infra, so it's more a pragmatic experiment than a turnkey product.
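One way modular adapters with per-agent env overrides could fit together is sketched below. This is an assumption-heavy illustration, not the repo's code: the registries, the backend names, and the `AGENT_<NAME>_<STAGE>` / `VOICE_<STAGE>` variable scheme are all hypothetical, and the backends are toy functions.

```python
import os
from typing import Callable, Dict, Mapping

# Hypothetical adapter registries: each stage maps a backend name to a callable.
ASR_BACKENDS: Dict[str, Callable[[bytes], str]] = {
    "echo": lambda audio: audio.decode("utf-8"),
}
LLM_BACKENDS: Dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "reverse": lambda text: text[::-1],
}
TTS_BACKENDS: Dict[str, Callable[[str], bytes]] = {
    "bytes": lambda text: text.encode("utf-8"),
}

DEFAULTS = {"ASR": "echo", "LLM": "upper", "TTS": "bytes"}

def resolve_backend(agent: str, stage: str, env: Mapping[str, str]) -> str:
    """Pick a backend name: per-agent variable, then global variable, then default.

    The variable naming scheme is invented for this sketch.
    """
    per_agent_key = f"AGENT_{agent.upper()}_{stage}"
    global_key = f"VOICE_{stage}"
    return env.get(per_agent_key) or env.get(global_key) or DEFAULTS[stage]

def build_pipeline(agent: str, env: Mapping[str, str] = os.environ) -> Callable[[bytes], bytes]:
    """Assemble an ASR -> LLM -> TTS chain for one agent from the registries."""
    asr = ASR_BACKENDS[resolve_backend(agent, "ASR", env)]
    llm = LLM_BACKENDS[resolve_backend(agent, "LLM", env)]
    tts = TTS_BACKENDS[resolve_backend(agent, "TTS", env)]
    return lambda audio: tts(llm(asr(audio)))

if __name__ == "__main__":
    env = {"VOICE_LLM": "upper", "AGENT_BOB_LLM": "reverse"}
    print(build_pipeline("alice", env)(b"hi"))
    print(build_pipeline("bob", env)(b"hi"))
```

Resolving backends by name per agent keeps each stage swappable without touching the others, though, as the review notes, mixing backends in the actual repo is still a manual step.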