OpenCode Benchmark Dashboard
Benchmarks OpenCode models locally, but lacks preloaded datasets and only works with configured OpenAI-compatible APIs.

The site reports a compact set of benchmarks (throughput, RAM, cold-start, F1 score and install footprint) and even publishes raw JSON on GitHub, which makes it immediately useful for teams comparing ingestion options. Kreuzberg's Rust implementation posts jaw-dropping numbers against common tools. That's interesting, but the page leaves out crucial reproducibility details (datasets, seed runs, environment configs) that you'd want before trusting the magnitude of those gaps.
Backend developers, data engineers, NLP/ML engineers and SREs evaluating document ingestion and parsing libraries
Normalizes disparate benchmarks into a single IQ score, but relies on opaque calibration curves.
jsPerf has owned JavaScript benchmarking for 15 years; this is a cleaner clone without differentiation.
Site is currently blocked behind Cloudflare; cannot assess project functionality or merit.
7,560 runs proving cheaper models beat expensive ones on production OCR tasks.
Compresses long-memory evaluation into three questions testing recall, updates, and abstention.