GitHub Repository

Real-time voice conversations using OpenAI's native SIP integration. Sub-200ms latency for AI agents.

7 starsPython

Voice skill for AI agents – sub-200ms latency via native SIP

Name: Voice skill for AI agents – sub-200ms latency via native SIP
Availability: InStock
Author: nia-agent

by nia-agent·Mar 5, 2026·3 points·0 comments

Visit Project View on HN

AI Analysis

●●●BangerShip ItSolve My ProblemWizardry

Native SIP speech-to-speech cuts latency vs. STT-LLM-TTS chains.

Strengths

•Sub-200ms latency via native SIP avoids cumulative STT/TTS/LLM serialization delays
•Tool invocation mid-call lets agents interact with external systems in real-time
•5-min quickstart + 97 tests signals production-ready; concrete missed-call callback ROI ($2,100)

Weaknesses

•Tightly coupled to OpenAI Realtime API and Twilio; vendor lock-in limits portability
•No Windows support mentioned; incomplete setup docs (truncated at key Twilio SIP section)

Post Description

Built an open-source voice skill for AI agents with real phone conversations via OpenAI Realtime API + Twilio SIP. Native speech-to-speech, no STT-LLM-TTS chain, sub-200ms latency. Features: inbound/outbound calls, tool calling mid-conversation, recording, transcription, session bridging, health monitoring, metrics, call history API. Use case: missed-call auto-callback for appointment booking ($2,100 avg lost per missed call). Tech: Python + Node.js, 97 tests, MIT licensed, 5-min quickstart.