NEW: we added 10 more entries to InferenceBench. Claude Fable 5 is the best model, but not by a large margin. We also had to amend the main prompt, since by default Fable 5 ended up cheating, according to our judge.
🎉 Big updates for InferenceBench v1.0.1! Some highlights:
- 10 more entries to the leaderboard, including Fable 5, Opus 4.8, Kimi 2.6, and Gemini 3.5 Flash
- Re-scoring / Re-evaluation of select models
See the changes for yourself at: https://inferencebench.ai/

