Normalized Scores
LLMSnare Arena
Follow the live benchmark timeline and the recent benchmark changes in one place.
[ BENCHMARK // LIVE ARENA ]
Live benchmark
See the ongoing benchmark results for the mainstream models.
Compare models
Pick up to 5 specific models to compare.
[ UPDATES // RECENT CHANGES ]
Update log
Recent changes that affect how this benchmark should be read.
2026-04-19 23:00 UTC
Changed the update cadence to every 3 hours and removed Claude Sonnet 4.5 and Claude Opus 4.5.
2026-04-17 02:00 UTC
Added Claude Opus 4.7, removed OpenAI GPT 4.1 and Claude Haiku 4.5, added the new search_text tool, and raised the difficulty.
2026-04-12
Added two models: Google’s Gemma 4 31B and Xiaomi’s Mimo v2 Pro.
2026-04-11
Raised the difficulty because too many models were hitting full score.
2026-04-10
Launched the live LLMSnare benchmark.