GitHub Repository

The addressee layer for voice agents: only speech meant for your agent reaches your STT, LLM, or TTS, on any stack.

3 stars · Python

We Help Voice AI Handle Group Conversations

Author: betweenDan

by betweenDan·Jun 23, 2026·4 points·0 comments


AI Analysis

●●● Banger · Solve My Problem · Big Brain

Solves the multi-human voice AI problem without wake words using attention pattern detection.

Strengths

•Addressee detection without wake words means less STT cost and fewer accidental triggers.
•Integrates with Pipecat, LiveKit, ElevenLabs, and Twilio — drops into existing voice stacks.
•Video+audio and audio-only models handle different hardware configurations.

Weaknesses

•Only 3 GitHub stars at submission — unproven in production deployments.
•Every human interacts differently, making consistent attention detection inherently difficult.

Post Description

Hey folks. We built SAA (Selective Auditory Attention) after trying to find ways to make a good experience with multiple robots/multiple agents. What typically ended up happening is they'd never stop talking.

This is an SDK you can put before your STT. It lets you know when your device is being spoken to or not without a wakeword. You can use it for:
- Single AI, Multi human
- Multi AI, Single human
- Multi AI, Multi human (we recommend also adding a wakeword on top for a better system)
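
To make "put it before your STT" concrete, here is a minimal sketch of a gate that forwards audio frames to STT only while a detector says the device is being addressed. It is not the SAA SDK's API: `AddresseeDetector`, `gate_frames`, `threshold`, and `hangover` are hypothetical names and values chosen for illustration, and the detector is assumed to return a score between 0 and 1 per frame.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical detector signature (an assumption, not the SAA SDK's API):
# takes one audio frame as raw bytes and returns a score in [0, 1] for
# "this speech is addressed to the device".
AddresseeDetector = Callable[[bytes], float]


def gate_frames(
    frames: Iterable[bytes],
    detector: AddresseeDetector,
    threshold: float = 0.5,  # assumed cutoff; would need tuning in practice
    hangover: int = 2,       # extra frames to forward after the score drops
) -> Iterator[bytes]:
    """Yield only the frames that should be passed on to STT."""
    remaining = 0
    for frame in frames:
        if detector(frame) >= threshold:
            # Addressed speech: forward it and reset the hangover window.
            remaining = hangover
            yield frame
        elif remaining > 0:
            # Score dropped, but keep a short tail so words are not clipped.
            remaining -= 1
            yield frame
        # Otherwise drop the frame: STT, LLM, and TTS never see it.
```

In use, you would wrap your microphone or transport frame source with `gate_frames(...)` and feed the result to whatever STT client you already have. The hangover is a design choice in this sketch, so a brief dip in the score mid-sentence does not cut off the end of a word.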

There are two models: one that uses video + audio and one that is audio-only. It works by looking for shifts in attention patterns (body language changes, vocal patterns). It's a tough problem to nail, as every human being is different in how they interact with people/devices.

Let me know how it is!