Built speech models for Europe, accidentally topped Open-ASR in English
Processes one hour of audio in under three seconds while topping Open ASR Leaderboard.

Dual-head Citrinet fine-tune beats running Whisper plus a separate classifier.
Developers building Mandarin speech applications needing speaker metadata
Whisper · NVIDIA NeMo · Azure Speech API
Processes one hour of audio in under three seconds while topping Open ASR Leaderboard.
Adds Qwen3 model support to WhisperX for users wanting alternatives.
Textual-criticism approach to transcript merging beats single-model Whisper on accuracy alone.
Transcribes overlapping speakers in a single pass without needing separate diarization steps.
INT4 inference engine beats llama.cpp on VRAM, but competing against established tools.
Open-source Otter.ai clone with local LLMs, but meeting transcription is crowded.