NUA: an agent that tests for product correctness


by Paster335 · Jun 2, 2026 · 8 points · 4 comments


AI Analysis

●● Solid · Solve My Problem · Niche Gem

Adversarial AI agent turns SEC rules into automated compliance tests.

Strengths

• Adversarial testing runs actual violation scenarios through your product as a real user would.
• Live compliance reports update automatically when rules change or your product ships.
• A specific SEC and FINRA focus shows genuine domain expertise, not a generic AI wrapper.

Weaknesses

• Narrow audience, limited to regtech and fintech companies with compliance requirements.
• Unclear how the agent handles complex product logic beyond surface-level UI interactions.

Post Description

We’ve been running Claude loops in the background a lot recently, and we would wake up to PRs that didn’t solve the problem we wanted solved and were built on wrong assumptions. Worse, the tests the agents wrote were usually tautological and didn’t test for intent. We wanted an agent that takes all the context a company has and writes tests that check for product correctness as well.
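The difference between a tautological test and one that checks intent can be sketched as follows. This is a hypothetical illustration, not something NUA produced: `discounted_price` and both tests are invented for the example.

```python
def discounted_price(price: float, percent: float) -> float:
    """Hypothetical product function: apply a percentage discount."""
    return price * (1 - percent / 100)


def test_tautological() -> None:
    # Restates the implementation's own formula, so it would still pass
    # if that formula were wrong.
    assert discounted_price(200.0, 25.0) == 200.0 * (1 - 25.0 / 100)


def test_intent() -> None:
    # States the expected outcome independently of the implementation:
    # 25% off 200 should be 150.
    assert discounted_price(200.0, 25.0) == 150.0


if __name__ == "__main__":
    test_tautological()
    test_intent()
```

The first test mirrors the code under test, so it cannot catch a wrong formula. The second encodes what the product is supposed to do, which is the kind of check the post is asking for.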

For example, we work in reg tech, so bugs aren’t always technical. We often see things like insider trading alerts that should have fired but didn’t. We wanted an agent that turns laws and regulations into tests.
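As a rough idea of what a rule turned into a test might look like, here is a minimal sketch. Everything in it is an assumption for illustration: the blackout-window rule is a simplified stand-in rather than the text of any actual regulation, and `Trade`, `should_alert`, and `BLACKOUT_DAYS` are hypothetical names, not part of NUA.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy parameter: number of days before an earnings
# announcement during which listed insiders may not trade.
BLACKOUT_DAYS = 14


@dataclass(frozen=True)
class Trade:
    trader_id: str
    ticker: str
    trade_date: date


def should_alert(
    trade: Trade,
    insiders: dict[str, set[str]],
    earnings_dates: dict[str, date],
) -> bool:
    """Stand-in for the product's alerting logic (assumed for this sketch)."""
    if trade.trader_id not in insiders.get(trade.ticker, set()):
        return False
    earnings = earnings_dates.get(trade.ticker)
    if earnings is None:
        return False
    window_start = earnings - timedelta(days=BLACKOUT_DAYS)
    return window_start <= trade.trade_date < earnings


INSIDERS = {"ACME": {"emp-1"}}
EARNINGS = {"ACME": date(2026, 7, 30)}


def test_insider_trade_in_blackout_fires_alert() -> None:
    # Violation scenario: a listed insider trades 5 days before earnings.
    trade = Trade("emp-1", "ACME", date(2026, 7, 25))
    assert should_alert(trade, INSIDERS, EARNINGS)


def test_insider_trade_outside_blackout_is_quiet() -> None:
    # Same insider, but 30 days before earnings: no alert expected.
    trade = Trade("emp-1", "ACME", date(2026, 6, 30))
    assert not should_alert(trade, INSIDERS, EARNINGS)


def test_non_insider_trade_is_quiet() -> None:
    # Someone not on the insider list trading inside the window.
    trade = Trade("emp-2", "ACME", date(2026, 7, 25))
    assert not should_alert(trade, INSIDERS, EARNINGS)


if __name__ == "__main__":
    test_insider_trade_in_blackout_fires_alert()
    test_insider_trade_outside_blackout_is_quiet()
    test_non_insider_trade_is_quiet()
```

Each test names a scenario the rule covers and asserts whether an alert should fire, so a missed alert shows up as a failing test instead of a silent gap.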

For now, users can upload PDF, MD, TXT, and DOCX files, but we’re planning integrations like Slack, Notion, Linear, and Zoom in the future.

We’re early on, so we would love to know what you all think!

Similar Projects

Security · ●●● Banger

Agent Red Team – Adversarial testing for AI agents before production

Tests agent actions and tool calls, not just output, with deterministic code validation.

Solve My Problem · Slick · Bold Bet

LukataSolutions

312mo ago

Security · ●● Solid

Khaos – Every AI agent I tested broke in under 30 seconds

Auto-patching LLM calls to inject faults and log telemetry is a neat technical trick that lets you fuzz real agent runs without changing your stack. The repo ships six intentionally vulnerable example agents and a CLI (discover/run/ci) with eval packs for security and resilience, so you can reproduce attacks and gate releases. It feels like an early, practical toolkit that fills a gap in agent security testing; adoption and more community playbooks will determine how far it goes.

Big Brain · Niche Gem

exordex

114mo ago

Infrastructure · ●●● Banger