GitHub Repository

Find what your AI agent gets wrong — before you have a rubric. Qualitative eval for PMs.

3 stars · Python

GEDD – Find what your AI agent gets wrong (before your users do)

by balasvce19855 · May 31, 2026 · 2 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Grounded theory methodology for AI evals before you have rubrics.

Strengths
  • Discovers failure modes you didn't anticipate through open coding
  • Two-persona workflow separates domain expertise from ML engineering
  • Tests against a real deployed endpoint, with latency and IAM included
Weaknesses
  • AWS sample repo, not a standalone product with ongoing support
  • Tied to Bedrock AgentCore and SageMaker, which limits portability
Category
Target Audience

Product managers and ML engineers evaluating AI agents

Similar To

LangSmith · Arize Phoenix · Braintrust

Similar Projects