Back to browse
Flight Risk: Can you break an AI agent?

Flight Risk: Can you break an AI agent?

by tetrakai·Apr 21, 2026·3 points·0 comments

AI Analysis

●●SolidNiche GemRabbit Hole

Six escalating rounds force deeper prompt injection tactics than standard static levels.

Strengths
  • Gamifies abstract prompt injection risks into tangible, hands-on challenges.
  • Escalating difficulty curve prevents simple ignore instructions brute forcing.
  • Immediate feedback helps engineers understand defensive gaps quickly.
Weaknesses
  • Lakera's Gandalf already dominates the AI security game niche.
  • Unclear if smarter AI is dynamic adaptation or just scripted levels.
Category
Target Audience

Security engineers, AI developers

Similar To

Gandalf by Lakera · GPT Prompt Engineer

Post Description

I built a security game that lets you try to break an AI support agent.

I work on security engineering, and it's incredibly hard to try to defend against an attack that you don't know how to perform yourself. There's also next to nowhere to improve your skills. I'd heard all about fooling AI agents with just "IGNORE ALL PREVIOUS INSTRUCTIONS", but I'd never actually put that into practice, and it turns out it's harder than you'd expect!

Just like knowing basic security skills is important for all software engineers, anyone working with AI should know what prompt injection looks like, and should be thinking about how to prevent it. Flight Risk lets you practice your AI agent manipulation skills: it's got your standard prompt injection and social engineering, but more than that too, each a real vulnerability.

Think you could crack it? Every engineer I've given it to has been surprised by the challenge! You can use the hints, but they affect your score ;)

Give it a try, and let me know how you do!

Similar Projects