GitHub Repository · 7 stars · TypeScript

A/B test your own VLMs for document parsing (Self-hosted Arena)

by matthew624·Feb 19, 2026·1 point·0 comments

AI Analysis

●● Solid · Solve My Problem · Slick · Niche Gem

Document parsing A/B test arena with Elo ranking: a niche but real alternative to OCR Arena.
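
For readers unfamiliar with Elo ranking, a minimal sketch of the standard Elo update for a single head-to-head battle follows. It is not taken from the repository: the names `expectedScore` and `updateElo`, the K-factor of 32, and the 400-point scale are illustrative assumptions based on the textbook formula, and the project may use different constants or a different variant.

```typescript
// Textbook Elo update for one battle between two models.
// Assumption: K = 32 and a 400-point scale; the project's values are not stated.
interface Ratings {
  a: number;
  b: number;
}

// Probability that A beats B under the Elo model.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// scoreA is 1 if model A won, 0 if it lost, and 0.5 for a tie.
function updateElo(ratings: Ratings, scoreA: number, k: number = 32): Ratings {
  const delta = k * (scoreA - expectedScore(ratings.a, ratings.b));
  // Rating points are zero-sum: whatever A gains, B loses.
  return { a: ratings.a + delta, b: ratings.b - delta };
}
```

With two models that both start at 1500, a win moves the winner to 1516 and the loser to 1484. An upset win against a higher-rated model moves both ratings further.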

Strengths
  • Blind battles with weighted matchmaking are designed so that underrepresented models get tested as often as the rest.
  • Real-time token streaming via SSE and Markdown/LaTeX rendering make result comparison immediate and readable.
  • One-click Docker deploy and multi-provider support (Anthropic, OpenAI, Ollama, etc.) lower setup friction significantly.
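
One plausible way to implement the weighted matchmaking mentioned above is to sample models with a probability that shrinks as their battle count grows. The sketch below is an assumption, not the repository's code: the `ModelStats` shape, the `1 / (1 + battles)` weighting, and the function names are all hypothetical.

```typescript
// Hypothetical per-model bookkeeping; the real project may track more fields.
interface ModelStats {
  name: string;
  battles: number;
}

// Pick one model, favoring those with fewer battles so far.
// Assumed weighting: 1 / (1 + battles). rng is injectable for testing.
function pickWeighted(models: ModelStats[], rng: () => number = Math.random): ModelStats {
  const weights = models.map((m) => 1 / (1 + m.battles));
  const total = weights.reduce((sum, w) => sum + w, 0);
  let r = rng() * total;
  for (let i = 0; i < models.length; i++) {
    r -= weights[i];
    if (r < 0) return models[i];
  }
  // Guard against floating-point rounding leaving r at or just above zero.
  return models[models.length - 1];
}

// Pick two distinct models for a blind battle.
function pickPair(
  models: ModelStats[],
  rng: () => number = Math.random
): [ModelStats, ModelStats] {
  if (models.length < 2) throw new Error("need at least two models");
  const first = pickWeighted(models, rng);
  const second = pickWeighted(models.filter((m) => m !== first), rng);
  return [first, second];
}
```

Under this weighting a model with no battles is ten times as likely to be drawn as one with nine, so newly added models catch up quickly without older ones being excluded.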
Weaknesses
  • Solves a real pain point, but the audience is narrow: teams with self-hosted VLMs and privacy-sensitive documents.
  • The 90% Claude Code attribution raises the question of how much of the project is custom architecture versus scaffolded boilerplate.
Target Audience

ML engineers evaluating custom document parsing models, teams comparing VLMs privately

Similar To

OCR Arena · Hugging Face Model Arena
