Classic Video Poker v2
Second version of HN front-pager adds real persistence and grinder leaderboards.

Duplicating transformer layers boosts benchmark scores without a single step of training.
AI researchers, ML engineers
HuggingFace Transformers · LLM Merger
LLM-playable Tron game via MCP with real progression—niche but genuinely fun.
LLM showdown in Snake, but the novelty wears off after five minutes of watching.
Ancient Rome Q&A benchmark shows an 81-percentage-point accuracy lift, but lacks adversarial defense evidence.
Civilization matches expose model divergence that static benchmarks miss—but it's a spectacle, not a measurement.
LLMs playing poker live is entertaining, but it's a novelty demo without depth or staying power for serious users.