GitHub Repository

Personal-finance assistant benchmark — evaluate real finance products against synthetic user personas

0 stars · TypeScript

TreasuryBench – an open benchmark for personal-finance AI advice

by juneadkhan·Jun 25, 2026·3 points·1 comment

AI Analysis

●● Solid · Big Brain · Bold Bet

Score caps on factual errors keep hallucinated finance advice from scoring well.

Strengths
  • Dangerous error tracking flags financially harmful misinformation separately from minor mistakes
  • Table-grounded factual verification prevents prose quality from masking wrong numbers
  • 81 tasks across 12 domains cover more ground than typical single-metric benchmarks
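
The listing does not say how the error caps or dangerous-error tracking are implemented. As a rough illustration only, a capped score could look something like the sketch below. Every name here (`GradedAnswer`, `cappedScore`, `countDangerous`) and both cap values are assumptions made for this example and are not taken from the TreasuryBench repository.

```typescript
// Hypothetical sketch: types, function names, and thresholds are assumptions,
// not the benchmark's actual scoring code.
type Severity = "minor" | "dangerous";

interface FactualError {
  claim: string;      // the statement that failed verification
  severity: Severity; // "dangerous" = financially harmful if followed
}

interface GradedAnswer {
  rawScore: number;       // rubric score in [0, 1] before any cap is applied
  errors: FactualError[]; // factual errors found when checking the answer
}

// Assumed ceilings: any factual error limits the score, and a dangerous
// error limits it much more sharply.
const MINOR_ERROR_CAP = 0.7;
const DANGEROUS_ERROR_CAP = 0.3;

// Returns the raw score clamped to the lowest cap triggered by any error.
function cappedScore(answer: GradedAnswer): number {
  let cap = 1;
  for (const err of answer.errors) {
    const errCap = err.severity === "dangerous" ? DANGEROUS_ERROR_CAP : MINOR_ERROR_CAP;
    cap = Math.min(cap, errCap);
  }
  return Math.min(answer.rawScore, cap);
}

// Dangerous errors are counted separately so they can be reported on their own.
function countDangerous(answer: GradedAnswer): number {
  return answer.errors.filter((e) => e.severity === "dangerous").length;
}
```

Under these assumed caps, a well-written answer with a raw score of 0.95 and one dangerous error would be scored 0.3, so strong prose cannot offset a harmful number.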
Weaknesses
  • Evaluating the Treasury product alongside competitors creates an obvious conflict of interest
  • Zero community stars suggest limited independent validation of the methodology
Target Audience

Fintech developers and personal finance app builders

Similar To

FinanceBench · FinQA · ConvFinQA

Similar Projects