Imagedojo.ai – Blind arena for Google, OpenAI, and xAI image generators
LMSYS Arena for images, but the leaderboard lacks volume: 359 images is too small a sample for statistically confident rankings.
Document parsing A/B test arena with Elo ranking; a niche but real alternative to OCR Arena.
ML engineers evaluating custom document parsing models, teams comparing VLMs privately
OCR Arena · Hugging Face Model Arena
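For readers unfamiliar with how arena-style leaderboards turn blind A/B votes into a ranking, here is a minimal sketch of the standard Elo update. It is not taken from any of the arenas listed here: the K-factor of 32, the 400-point scale, and the function names are illustrative assumptions, and a given arena may use different constants or a different rating model entirely.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k (assumed 32 here) controls how far a single vote moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's update mirrors A's, so the sum of the two ratings is conserved.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two hypothetical models start level; A wins a single blind vote.
    print(update_elo(1500.0, 1500.0, 1.0))
```

Each vote moves a rating by at most K points, so with only a few hundred comparisons spread across several models the ordering can still flip on a handful of votes, which is why low volume limits confidence in the leaderboard.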
First benchmark measuring semantic correctness over text similarity for document parsing.
LlamaIndex open-sources its parser core, but LlamaParse cloud still handles complex layouts.
Tests live in README as plain English; clever partial parsing eliminates Gherkin boilerplate overhead.
Per-span confidence scores let you review uncertain OCR before trusting 200k-page runs.
Rigorous benchmark methodology, but it's research, not a tool you can use.