GitHub Repository

Find what your AI agent gets wrong — before you have a rubric. Qualitative eval for PMs.

3 stars · Python

GEDD – Find what your AI agent gets wrong (before your users do)

by balasvce19855 · May 31, 2026 · 2 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Grounded theory methodology for AI evals before you have rubrics.

Strengths

• Discovers failure modes you didn't anticipate through open coding
• Two-persona workflow separates domain expertise from ML engineering
• Tests against a real deployed endpoint, so latency and IAM are included

Weaknesses

• An AWS sample repo, not a standalone product with ongoing support
• Tied to Bedrock AgentCore and SageMaker, which limits portability

Similar Projects

AI/ML · ● Mid

A Claude Code skill that scopes problems like Peter Naur

Naur's 1985 theory applied to AI agents, but it's just a prompt template.

Big Brain · Niche Gem

spinchange

2014d ago

Data · ●● Solid

A skill to audit your dbt project for what an AI agent will get wrong

Catches AI-breaking dbt issues like conflicting revenue metrics and YAML/SQL mismatches.

Niche Gem · Solve My Problem

matthieu_bl

302d ago

Security · ●● Solid

Agent Skill Based on "Open Source Security at Astral"

Automates Astral's security framework into an agent skill that produces HTML reports.

Niche Gem · Big Brain

ramoz

302mo ago

Developer Tools · ● Mid

Agent-evals – Claude skill to build your own evals

Claude Skill for agent evals, but LangSmith and Arize already own this.

Solve My Problem

sauercrowd

911mo ago

Productivity · ●● Solid

Skill for structured deep research with Claude Code and Obsidian

Zettelkasten automation for Obsidian—compounds research sessions, fills gaps automatically.

Niche Gem · Ship It · Big Brain

alvdef

113mo ago

AI/ML · ● Mid

Skills for spec-driven AI software development

Curated skill collection for spec-driven AI development, competing with other prompt libraries.

Niche Gem

puristajs

2016d ago