Back to browse
GitHub Repository

LLM benchmark and leaderboard for narrator-bias sycophancy, opposite-narrator contradictions, and judgment consistency.

37 stars

LLM Sycophancy Benchmark: Opposite-Narrator Contradictions

by zone411·Mar 10, 2026·3 points·0 comments

AI Analysis

●●●BangerBig BrainDark Horse

Opposite-narrator test catches models agreeing with both sides of same dispute.

Strengths
  • Strict metric counts sycophancy only when model sides with both opposing narrators
  • Live leaderboard compares Gemini, GPT, Claude, and Grok
  • Open-source repo with clear methodology and reproducible tests
Weaknesses
  • Niche audience limited to AI safety researchers
  • Benchmark scope focused only on narrator-bias contradictions
Category
Target Audience

AI researchers and ML engineers

Similar To

HELM · BigBench · LMSys Arena

Similar Projects