LLM Leaderboard
Consensus agreement across all analyses
Updated Jun 13, 2026
Overall% = consensus agreement rate across all analyses. Risk% / Safe% = per-condition accuracy.
Models with <10 days of data are marked and excluded from ranking by default.
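As a rough illustration of the definitions above, the sketch below computes the three percentages and applies the 10-day rule. The record layout (`Analysis`, its `condition` labels "risk" and "safe", and the `correct` flag meaning "the model's verdict matched the consensus") is hypothetical, as is the reading of "days of data" as distinct calendar days; the leaderboard's actual pipeline is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

MIN_DAYS = 10  # threshold taken from the note above


@dataclass
class Analysis:
    # Hypothetical record layout, for illustration only.
    model: str
    day: date
    condition: str  # assumed labels: "risk" or "safe"
    correct: bool   # assumed: model verdict matched the consensus


def pct(hits, total):
    """Percentage rounded to one decimal, or None when there is no data."""
    return round(100.0 * hits / total, 1) if total else None


def leaderboard(analyses, include_limited=False):
    """Build ranked rows; models with < MIN_DAYS distinct days are
    flagged and left out unless include_limited is True."""
    by_model = defaultdict(list)
    for a in analyses:
        by_model[a.model].append(a)

    rows = []
    for model, items in by_model.items():
        risk = [a for a in items if a.condition == "risk"]
        safe = [a for a in items if a.condition == "safe"]
        days = len({a.day for a in items})  # distinct days with data
        rows.append({
            "model": model,
            "overall": pct(sum(a.correct for a in items), len(items)),
            "risk": pct(sum(a.correct for a in risk), len(risk)),
            "safe": pct(sum(a.correct for a in safe), len(safe)),
            "days": days,
            "limited_data": days < MIN_DAYS,
        })

    ranked = [r for r in rows if include_limited or not r["limited_data"]]
    ranked.sort(key=lambda r: r["overall"], reverse=True)
    return ranked
```

Passing `include_limited=True` keeps the short-history models in the list, still flagged through `limited_data`, which mirrors the "excluded from ranking by default" behaviour.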
| # | Model | Overall% | Risk% | Safe% | Avg conf. |
|---|---|---|---|---|---|