GitHub Repository

Local multimodal voice AI assistant on Snapdragon 8 Gen 2

0 stars · Kotlin

Qiaohu – offline multimodal voice assistant on Snapdragon 8 Gen 2

by donge · Apr 9, 2026 · 2 points · 0 comments

AI Analysis

●●● Banger · Wizardry · Niche Gem · Zero to One

Full voice assistant pipeline with barge-in running entirely offline on Snapdragon GPU.

Strengths
  • Barge-in interrupts TTS mid-sentence, which requires precise coordination across the audio pipeline.
  • Runs Gemma 4 2B via LiteRT-LM on the mobile GPU with no cloud dependency.
  • Complete build instructions with exact library versions and model download paths.
Weaknesses
  • Chinese-only support limits the audience; English ASR/TTS would broaden its appeal.
  • The 2.6 GB model download on first run is substantial for users on mobile data.
Category
Target Audience

Android developers and users wanting private, offline voice interaction

Similar To

Off Grid · Sherpa-ONNX

Post Description

Built a fully offline Chinese voice assistant that runs entirely on-device (no server, no cloud).

Pipeline: VAD (Silero, 16 kHz) → LLM (Gemma 4 2B via LiteRT-LM on Snapdragon GPU) → TTS (sherpa-onnx + matcha-icefall-zh-baker, 22050 Hz) → speaker. Barge-in interrupts TTS mid-sentence.
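The barge-in step can be illustrated with a small state machine. This is only a language-neutral sketch in Python, not the project's Kotlin code: the `BargeInController` class, the 0.5 probability threshold, and the three-frame debounce are all assumptions made for illustration. It assumes the VAD keeps producing a speech probability per audio frame while TTS is playing, and that playback should stop once several consecutive frames look like speech.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()  # waiting for the user to speak
    SPEAKING = auto()   # TTS audio is being played


@dataclass
class BargeInController:
    """Hypothetical controller: stops TTS when the VAD hears sustained speech."""
    threshold: float = 0.5      # assumed speech-probability cutoff
    min_speech_frames: int = 3  # assumed debounce (consecutive speech frames)
    state: State = State.LISTENING
    _streak: int = 0
    events: list = field(default_factory=list)

    def start_tts(self) -> None:
        """Mark the start of TTS playback and reset the speech streak."""
        self.state = State.SPEAKING
        self._streak = 0

    def on_vad_frame(self, speech_prob: float) -> bool:
        """Feed one VAD frame; return True if this frame triggers a barge-in."""
        if self.state is not State.SPEAKING:
            return False
        # Count consecutive speech frames; any non-speech frame resets the count.
        self._streak = self._streak + 1 if speech_prob >= self.threshold else 0
        if self._streak >= self.min_speech_frames:
            self.state = State.LISTENING
            self._streak = 0
            self.events.append("tts_stopped")  # a real app would halt playback here
            return True
        return False


# Usage: one isolated spike is ignored, three speech frames in a row interrupt.
ctrl = BargeInController()
ctrl.start_tts()
probs = [0.1, 0.9, 0.2, 0.8, 0.9, 0.95, 0.9]
fired = [ctrl.on_vad_frame(p) for p in probs]
```

The debounce trades responsiveness for robustness: a larger `min_speech_frames` avoids cutting off TTS on short noises but delays the interruption by more frames. A real implementation would also have to keep the device's own TTS output from triggering the VAD, which this sketch does not address.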

Demo video in the README. Code and full setup instructions on GitHub.

Similar Projects

AI/ML · ●● Solid

Local Voice Assistant

This repo bundles a complete local audio loop: the client captures audio, the backend transcribes it with Parakeet, queries a quantized Mistral LLM via Ollama, then renders speech with Kokoro or Qwen3-TTS (for voice cloning). It reports a round-trip of about 1 s on an RTX 5070. It's a practical, take-it-home demo for running privacy-first voice agents, though it remains a demo: it requires specific tooling (Ollama, GPU headroom), has obvious TODOs (VAD, better warmup for cloning), and doesn't reinvent the architecture.

Wizardry · Niche Gem
armcat
203mo ago