GitHub Repository

Reproducible benchmark results for Compresh: episodic-memory recall + compression fidelity, scored with each benchmark's own method and an independent judge.

0 stars · Python

We matched full-context recall on ~1% of the tokens (open benchmark)

by compresh·Jun 23, 2026·2 points·0 comments

AI Analysis

●● Solid · Big Brain · Dark Horse

Matches full-context recall using about 1% of the tokens, but chronological ordering still lags behind naive RAG.

Strengths
  • verify.py reproduces results with pure stdlib, no API keys or network required
  • Transparent about weaknesses like chronological ordering (0.44 vs 0.65 for RAG)
  • Independent judge scoring makes results comparable across different systems
Weaknesses
  • Benchmark repo only; the actual Compresh system lives at the external compresh.sh domain
  • EpBench is narrow in scope; broader T-bench and compression-fidelity results are still to come
Category
Target Audience

ML engineers and researchers working on LLM context optimization

Similar To

Needle In A Haystack · RAGAS · LongBench

Similar Projects

AI/ML · ●● Solid

Entroly – Compress codebase context for LLMs by 78% using Rust

Entropy-based context compression beats naive token stuffing, but the category is crowded.

Big Brain · Niche Gem
savetokens
10 · 3mo ago
Developer Tools · ●●● Banger

Engram adds universal context spine for AI coding IDEs

Heuristic DOM matching cuts LLM calls, saving 89% on tokens compared to naive context stuffing.

Big Brain · Solve My Problem
NickCirv
10 · 1mo ago