🚀 We're launching Evaluation Cards (beta): a centralized public record of AI evaluation results 🚀
Not another leaderboard. Every score comes with who ran it, the settings they used, what the benchmark tests, and the other results reported for the same model, side by side. 🧵👇

