SecLens – Claude Opus scores 0.000 on auth vulns, 0.689 on SSRF

by subho007 · Apr 1, 2026 · 4 points · 0 comments

AI Analysis

●● Solid · Dark Horse · Big Brain · Eye Candy

Stakeholder-weighted LLM security benchmark reveals 31-point score swings for the same model.
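
To make the idea of a stakeholder-weighted score concrete, here is a minimal sketch of how one model's per-category results can collapse into very different composites depending on whose priorities are applied. The two category scores come from the post title. The role names, the weights, and the weighted-mean formula are assumptions made up for illustration and are not taken from SecLens, so the swing this sketch computes is not the benchmark's reported 31-point figure.

```python
# Per-category scores for one model, as quoted in the post title.
scores = {"auth": 0.000, "ssrf": 0.689}

# HYPOTHETICAL role weights, invented for this sketch.
# SecLens's actual roles and weights are not given in the listing.
role_weights = {
    "role_a": {"auth": 0.8, "ssrf": 0.2},  # cares mostly about auth flaws
    "role_b": {"auth": 0.2, "ssrf": 0.8},  # cares mostly about SSRF
}


def weighted_score(category_scores, weights):
    """Weighted mean of per-category scores, normalised by total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(category_scores[c] * w for c, w in weights.items()) / total


# One composite per role, for the same underlying model results.
composites = {
    role: weighted_score(scores, weights)
    for role, weights in role_weights.items()
}

# Spread between the most and least favourable role, on a 100-point scale.
swing_points = 100 * (max(composites.values()) - min(composites.values()))

for role, value in composites.items():
    print(f"{role}: {value:.3f}")
print(f"swing: {swing_points:.1f} points")
```

A plain unweighted average of the two categories would hide the gap entirely, which is the kind of weakness the role-based view is meant to surface.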

Strengths
  • Role-based weighting exposes hidden model weaknesses that aggregate scores miss.
  • 406 CVE tasks across 12 models provide substantial empirical data.
  • Clean interactive leaderboard makes complex benchmark data immediately digestible.
Weaknesses
  • Benchmark projects are reference material, not tools with recurring utility.
  • No clear path to integrate weights into actual model selection workflows.
Category
Target Audience

Security teams and CTOs evaluating LLMs for security tasks

Similar To

SecurityBench · Hugging Face Open LLM Leaderboard · LiveBench

Similar Projects