GitHub Repository

The addressee layer for voice agents: only speech meant for your agent reaches your STT, LLM, or TTS, on any stack.

3 stars · Python

We Help Voice AI Handle Group Conversations

by betweenDan·Jun 23, 2026·4 points·0 comments

AI Analysis

●●● Banger · Solve My Problem · Big Brain

Solves the multi-human voice AI problem without wake words using attention pattern detection.

Strengths
  • Addressee detection without wake words means lower STT cost and fewer accidental triggers.
  • Integrates with Pipecat, LiveKit, ElevenLabs, and Twilio, so it drops into existing voice stacks.
  • Video+audio and audio-only models handle different hardware configurations.
Weaknesses
  • Only 3 GitHub stars at submission, so it is unproven in production deployments.
  • Every human interacts differently, making consistent attention detection inherently difficult.
Category
Target Audience

Voice AI developers building multi-human or multi-agent conversational systems

Similar To

Picovoice · Porcupine · Rhino

Post Description

Hey folks. We built SAA (Selective Auditory Attention) after trying to find ways to make a good experience with multiple robots or multiple agents. What typically ended up happening is that the agents would never stop talking.

This is an SDK you can put before your STT. It tells you whether your device is being spoken to, without a wake word. You can use it for:
- Single AI, multi human
- Multi AI, single human
- Multi AI, multi human (we recommend also adding a wake word on top for a better system)
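To make the "before your STT" placement concrete, here is a minimal sketch of a gate that forwards only addressed audio frames downstream. It does not use the real SAA API, which this post does not show: `AddresseeGate`, `fake_scorer`, the `threshold` default, and `hangover_frames` are all hypothetical names and values chosen for illustration, and the scorer stands in for whatever the SDK actually returns.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator

# Hypothetical scorer signature (NOT the real SAA API): takes one audio
# frame and returns the probability that the speech is addressed to us.
AddresseeScorer = Callable[[bytes], float]


@dataclass
class AddresseeGate:
    """Forwards frames to STT only while the device is being addressed."""

    scorer: AddresseeScorer
    threshold: float = 0.5      # assumed cutoff, would need tuning
    hangover_frames: int = 2    # keep forwarding briefly after the score drops
    _remaining: int = field(default=0, init=False)

    def filter(self, frames: Iterable[bytes]) -> Iterator[bytes]:
        for frame in frames:
            if self.scorer(frame) >= self.threshold:
                # Addressed: forward the frame and re-arm the hangover window.
                self._remaining = self.hangover_frames
                yield frame
            elif self._remaining > 0:
                # Not addressed, but still inside the hangover window.
                self._remaining -= 1
                yield frame
            # Otherwise the frame is dropped and STT never sees it.


def fake_scorer(frame: bytes) -> float:
    """Stand-in model: frames starting with b"A" count as addressed."""
    return 0.9 if frame.startswith(b"A") else 0.1


frames = [b"x1", b"A1", b"A2", b"x2", b"x3", b"x4", b"A3"]
gate = AddresseeGate(fake_scorer, hangover_frames=1)
forwarded = list(gate.filter(frames))
print(forwarded)
```

In a real pipeline the output of `filter` would feed the STT stream, so speech between bystanders costs nothing downstream. The hangover window is a design choice of this sketch, meant to avoid clipping the tail of an utterance when the score dips briefly.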

There are two models: one that uses video + audio and one that is audio-only. They work by looking for shifts in attention patterns (body language changes, vocal patterns). It's a tough problem to nail, as every human being is different in how they interact with people and devices.

Let me know how it is!

Similar Projects