Podvoice – Local-first CLI to turn Markdown into multi-speaker audio
Local multi-speaker TTS CLI with zero cloud dependencies beats ElevenLabs for podcast scripts.

M-Pesa payments and Kenyan English optimization in a crowded Otter.ai market.
Kenyan professionals, journalists, researchers, and content creators
Otter.ai · Rev.com · Descript
Multi-voice podcast generation in one command without ElevenLabs API costs or rate limits.
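As a rough illustration of the first step such a tool needs, here is a minimal sketch that splits a Markdown script into speaker turns before any text-to-speech runs. The `Name: text` line convention, the `parse_script` function, and the `TURN` pattern are assumptions made for this example and are not taken from Podvoice itself; its actual script syntax may differ.

```python
import re
from typing import List, Tuple

# Hypothetical convention: a line of the form "Name: spoken text" starts a new turn.
TURN = re.compile(r"^(?P<speaker>[A-Za-z][\w ]{0,30}):\s+(?P<text>.+)$")


def parse_script(markdown: str) -> List[Tuple[str, str]]:
    """Split a Markdown script into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and Markdown headings
        match = TURN.match(line)
        if match:
            turns.append((match.group("speaker").strip(), match.group("text").strip()))
        elif turns:
            # A line without a speaker label continues the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line)
    return turns


script = """# Episode 1

Host: Welcome back.
Guest: Thanks for having me.
It is great to be here.
"""

for speaker, text in parse_script(script):
    print(f"{speaker} -> {text}")
```

Each resulting turn could then be routed to a different voice. Note that this sketch treats any short `Word: ...` line as a speaker label, so a real parser would likely need a stricter marker.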
Echo deduplication and dual-channel audio for local meeting transcripts.
It pairs WhisperX-grade transcription (speaker diarization and word-level timestamps) with optional multi-LLM analysis (summaries, Q&A, sentiment, topics, and even fact-checking), plus YouTube import and standard export formats. Being vendor-agnostic and offering fact-checking is a smart differentiator, but the space is crowded (Descript, Otter, and others); clearer accuracy numbers, pricing, or unique workflow hooks would help it stand out.
Outputs ready-to-use Markdown with speaker diarization and timestamps, accepts Apple Podcasts, YouTube, and RSS links, and can run fully locally or use ElevenLabs for higher-quality diarization. Not groundbreaking, since speech-to-text pipelines already exist, but the one-command UX, RSS browsing/search flags, and explicit local mode make it genuinely useful for people who want tidy transcripts without wiring together multiple tools.
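To make "Markdown with speaker diarization and timestamps" concrete, here is a minimal sketch of how diarized segments might be rendered. The `Segment` type, the `to_markdown` function, and the output layout are all hypothetical and chosen for illustration; the tool's actual output format is not described in the listing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    speaker: str   # diarization label, e.g. "Speaker 1"
    start: float   # segment start time in seconds
    text: str


def fmt_ts(seconds: float) -> str:
    """Format seconds as HH:MM:SS, truncating fractions of a second."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_markdown(segments: List[Segment]) -> str:
    """Render segments as Markdown paragraphs, merging consecutive
    segments from the same speaker into one paragraph."""
    lines: List[str] = []
    prev_speaker = None
    for seg in segments:
        if seg.speaker == prev_speaker:
            lines[-1] += " " + seg.text
        else:
            lines.append(f"**{seg.speaker}** [{fmt_ts(seg.start)}]: {seg.text}")
        prev_speaker = seg.speaker
    return "\n\n".join(lines)


segments = [
    Segment("Speaker 1", 0.0, "Hello."),
    Segment("Speaker 1", 2.5, "Welcome."),
    Segment("Speaker 2", 65.2, "Thanks."),
]
print(to_markdown(segments))
```

Merging consecutive segments from the same speaker keeps the transcript readable, at the cost of showing only the timestamp of the first segment in each paragraph.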
Transcribes overlapping speakers in a single pass without needing separate diarization steps.