GitHub Repository

Reproducible benchmark results for Compresh: episodic-memory recall + compression fidelity, scored with each benchmark's own method and an independent judge.

0 stars · Python

We matched full-context recall on ~1% of the tokens (open benchmark)

Author: compresh

by compresh · Jun 23, 2026 · 2 points · 0 comments


AI Analysis

●● Solid · Big Brain · Dark Horse

Matches full-context recall on ~1% of the tokens, but chronological ordering still lags behind naive RAG.

Strengths

• verify.py reproduces the results with pure stdlib; no API keys or network access required
• Transparent about weaknesses such as chronological ordering (0.44 vs. 0.65 for RAG)
• Independent judge scoring makes results comparable across different systems

Weaknesses

• Benchmark repo only; the actual Compresh system lives at the external compresh.sh domain
• EpBench is narrow in scope; broader T-bench and compression fidelity results are still to come