GitHub Repository

WMB-100K — The first 100,000-turn benchmark for AI memory systems

12 stars · Rust

WMB-100K – Open benchmark for AI memory systems at 100K turns

by wontopos · Apr 1, 2026 · 2 points · 0 comments

AI Analysis

●●● Banger · Big Brain · Niche Gem

A 100K-turn benchmark that tests situational memory retrieval, where other benchmarks stop at 600 turns.

Strengths
  • Simulates a 4.3M-token context window, dwarfing prior benchmarks at 50K tokens.
  • Situational questions test retrieval relevance, not just keyword fact lookup.
  • Includes 400 false-memory tests that specifically check hallucination defense.
Weaknesses
  • Niche audience limited to AI memory system researchers and builders.
  • Scoring requires LLM judges, adding cost and latency to evaluation.
Target Audience

AI researchers and engineers building long-term memory systems

Similar To

LOCOMO · LongMemEval · Needle In A Haystack

Similar Projects