OpenCode Benchmark Dashboard
Benchmarks OpenCode models locally, but lacks preloaded datasets and only works with configured OpenAI-compatible APIs.

Clean data viz, but Our World in Data already does this with more rigor.
Policy researchers, journalists, EU advocates
Our World in Data · Gapminder · World Bank Data
PowerBI-lite for tech comps, but Crunchbase and PitchBook do this better with more data.
Streams evals from a tiny Python client into a shared dashboard and lets you run parameter sweeps and compare up to six configurations with radar/bar charts and scorecards — exactly the sort of tooling that stops results getting lost in notebooks. Useful, pragmatic product for teams who repeatedly evaluate models, but it's competing with general observability/experiment trackers (W&B, Neptune) and will need strong integrations and metric flexibility to stand out.
The site centers on a compact set of benchmarks (throughput, RAM, cold-start, F1 score and install footprint) and even publishes raw JSON on GitHub, which makes it immediately useful for teams comparing ingestion options. Kreuzberg's Rust implementation posts jaw-dropping numbers against common tools; that's interesting, but the page leaves out crucial reproducibility details (datasets, seed runs, environment configs) you'd want before trusting the magnitude of those gaps.
Real-time GPU pricing aggregator, but existing tools like Crusoe Dashboard already solve this.
Simple household survey with local-only storage and a math-puzzle gate.