Teapot – A methodology for pen testing voice AI agents

Name: Teapot – A methodology for pen testing voice AI agents
Availability: InStock
Author: xmhatx

by xmhatx·Feb 18, 2026·7 points·10 comments

Visit Project View on HN

AI Analysis

●●SolidBig BrainNiche Gem

Voice-specific prompt injection framework, but testing methodology alone isn't a shipping product.

Strengths

•Identifies real attack surface unique to STT+LLM+TTS pipeline that text-based testing misses
•Systematizes attack patterns across six concrete phases (Transcription, Exploration, Attack surface, Prompt injection, Output, Tool abuse)
•Addresses genuine gap: existing OWASP LLM Top 10 assumes text-only interfaces

Weaknesses

•Published as documentation only; no open-source test harness, automation tools, or proof-of-concept code shipped
•Unclear if methodology is novel enough to warrant a full brand—borrows structure from standard pentesting (recon, attack, eval)

Post Description

Hello HN, I am Brian Cardinale, a penetration tester and security researcher at SecureCoders. We have been performing more and more AI based security assessments. We were presented a unique challenge of testing a system where the only interface was voice based, and as much as I like talking on the phone , we decided to create a test harness to facilitate the actual testing in a more systematic way. The technical test harness was the easy part, though. Creating test goals and attack strategies to help facilitate repeated and comprehensive testing became the real challenge. As such, we have been working on documenting our processes to share with the greater community and as a starting point for discussion. These systems present unique challenges where cleverness appears to be the name of the game. Such as suggesting for the agent to share its thoughts in “Inner Monologue” tags instead of “thinking” tags because those were specifically excluded in the agents prompt. Ya know, just silly things. Anyway, if reading is not your thing, I also did a walkthrough video of this methodology here: https://www.youtube.com/watch?v=XNmqCXsEc8Y

tl;dr: AI testing is tricky, we are documenting and sharing our tricks

Do you have any favorite AI jailbreak tricks?