GitHub Repository

Legal Action Boundary Eval (LABE): public proxy eval for legal AI workflows at the action boundary

3 stars · Python

Legal Action Boundary Eval for agentic legal workflows

Author: kankouadio_vx

by kankouadio_vx · Apr 22, 2026 · 2 points · 2 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

Evaluates AI at the action boundary, not just understanding quality—most benchmarks stop too early.

Strengths

•Action-boundary focus catches failures quality evals miss
•Dual-language suite with identical scenarios in TypeScript and Python
•Public artifacts and reproducible methodology with raw results

Weaknesses

•Promotes VerifiedX product throughout—feels like marketing dressed as open source
•Narrow legal AI audience limits broader adoption and community contribution

Post Description

We published LABE, a public benchmark for legal AI at the exact point where a system is about to take a real high-impact action.

Current result:

•baseline executed 18 unjustified high-impact action points
•with VerifiedX that dropped to 0
•false blocks in the current suite: 0
•surviving-goal completion improved from 41.7% to 100%

Same harness, same prompts, same playbooks, baseline vs VerifiedX.
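
As a rough illustration of how the three reported numbers relate, here is a minimal Python sketch. It is not taken from the LABE repo: the `ActionPoint` and `ScenarioRun` records, their field names, and the `summarize` helper are all hypothetical, and the reading of "surviving-goal completion" as the share of scenarios whose legitimate goal is still completed is an assumption. The sample data is a toy input, not the benchmark's results.

```python
from dataclasses import dataclass


@dataclass
class ActionPoint:
    # Hypothetical record: one moment where the system could take a high-impact action.
    justified: bool  # was the action warranted by the scenario?
    executed: bool   # did the system actually carry it out?


@dataclass
class ScenarioRun:
    # Hypothetical record: one scenario run under one configuration.
    action_points: list[ActionPoint]
    goal_completed: bool  # assumed meaning: the legitimate goal was still reached


def summarize(runs: list[ScenarioRun]) -> dict[str, float]:
    """Aggregate the three headline metrics over a list of runs."""
    points = [p for run in runs for p in run.action_points]
    return {
        # Actions that should not have happened but did.
        "unjustified_executed": sum(1 for p in points if not p.justified and p.executed),
        # Actions that should have happened but were stopped.
        "false_blocks": sum(1 for p in points if p.justified and not p.executed),
        # Share of scenarios whose goal was completed.
        "goal_completion": sum(r.goal_completed for r in runs) / len(runs) if runs else 0.0,
    }


# Toy input for illustration only.
runs = [
    ScenarioRun([ActionPoint(justified=False, executed=True),
                 ActionPoint(justified=True, executed=True)], goal_completed=False),
    ScenarioRun([ActionPoint(justified=False, executed=False),
                 ActionPoint(justified=True, executed=False)], goal_completed=True),
]
summary = summarize(runs)
print(summary)
```

Running the same aggregation over a baseline run set and a guarded run set, with harness, prompts, and playbooks held fixed, would give a side-by-side comparison of the kind the post reports.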

Legal is the first public instance. The same method applies to support, healthcare RCM, procurement, and finance too.

Repo, methodology, and raw artifacts are public: https://github.com/bigkan8/legal-action-boundary-eval