GitHub Repository

CPU-only voice agent approximating Thinking Machines' Interaction Models demo

17 stars · Python

Recreate Thinking Machines 276B voice demo with duct tape and 8B model

by mrkn1 · Jun 12, 2026 · 1 point · 0 comments

AI Analysis

●●● Banger · Wizardry · Dark Horse · Big Brain

Runs a Thinking Machines-style voice agent on a laptop CPU, with no GPU required.

Strengths
  • Four complex behaviors work end-to-end on CPU: friend detection, translation, slouch detection, background search.
  • Single asyncio loop orchestrates webcam, mic, speaker, and network calls without GPU acceleration.
  • Honest about limitations: it describes itself as duct-tape orchestration and does not claim to match the 276B architecture.
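
The single-loop orchestration mentioned above can be pictured with a minimal sketch. This is not the repository's code: `sensor`, `fake_network_call`, `orchestrate`, and `run_demo` are hypothetical names, and the devices and remote calls are simulated with `asyncio.sleep` so the example runs without hardware or API keys.

```python
import asyncio


async def sensor(name, queue, count, interval):
    """Hypothetical stand-in for a webcam or mic reader: emits `count` events."""
    for i in range(count):
        await asyncio.sleep(interval)  # simulates waiting on a device frame
        await queue.put((name, i))


async def fake_network_call(event):
    """Hypothetical stand-in for a remote LLM or search request."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"reply to {event[0]}#{event[1]}"


async def orchestrate(total_events):
    """One event loop: two producers feed a queue, one consumer handles events."""
    queue = asyncio.Queue()
    spoken = []  # stands in for audio sent to the speaker
    webcam_count = total_events // 2
    mic_count = total_events - webcam_count
    producers = [
        asyncio.create_task(sensor("webcam", queue, webcam_count, 0.005)),
        asyncio.create_task(sensor("mic", queue, mic_count, 0.003)),
    ]
    for _ in range(total_events):
        event = await queue.get()
        # While this call is awaited, the producers keep running on the same loop.
        spoken.append(await fake_network_call(event))
    await asyncio.gather(*producers)
    return spoken


def run_demo(total_events=6):
    return asyncio.run(orchestrate(total_events))
```

Calling `run_demo()` returns one simulated reply per event. The point of the design is that every wait (device frame, network response) is an `await`, so a single CPU thread can interleave all of them without a GPU or extra threads.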
Weaknesses
  • Depends on external APIs (DeepInfra, Serper) for LLM inference and web search.
  • Demo replication rather than a general-purpose product with broader use cases.
Category
Target Audience

Developers interested in local AI and voice interfaces

Similar To

Open Interpreter · Voiceflow · Rhasspy

Similar Projects

AI/ML · ●● Solid

Local Voice Assistant

This repo bundles a complete local audio loop: the client captures audio, and the backend transcribes it with Parakeet, queries a quantized Mistral LLM via Ollama, then renders speech with Kokoro, or with Qwen3-TTS for cloning. It reports a round trip of about 1 s on an RTX 5070. It's a practical, take-it-home demo for running privacy-first voice agents, but it is still a demo: it requires specific tooling (Ollama, GPU headroom), has obvious TODOs (VAD, better warmup for cloning), and does not reinvent the architecture.

Wizardry · Niche Gem
armcat
20 · 3mo ago