ICML 2026 spotlight paper estimates tail risks in LLMs
A spotlight paper at ICML 2026 presents methods to estimate tail risks in large language models. The work targets rare, high-impact failure modes that standard pre-deployment evaluations miss. It reduces evaluation cost by requiring fewer samples while still detecting corner cases that would otherwise surface only after real-world deployment. The findings indicate that models cleared by standard evaluations can still generate problematic outputs under broader conditions, underscoring the need for ongoing post-deployment monitoring of LLMs.
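The paper's specific estimator isn't described here, but the core difficulty it addresses can be illustrated with a minimal sketch: plain Monte Carlo sampling almost never hits a rare failure mode, whereas a variance-reduction technique such as importance sampling (used below purely as an illustrative stand-in, not as the paper's method) recovers the tail probability from far fewer samples. The threshold, distributions, and function names are all hypothetical.

```python
import math
import random

random.seed(0)

THRESHOLD = 4.0  # hypothetical: a "failure" is an outcome score beyond this tail cutoff


def naive_estimate(n: int) -> float:
    """Plain Monte Carlo: sample from the nominal output distribution
    (stand-in: standard normal) and count threshold exceedances."""
    hits = sum(1 for _ in range(n) if random.gauss(0, 1) > THRESHOLD)
    return hits / n


def importance_estimate(n: int, shift: float = 4.0) -> float:
    """Importance sampling: draw from a proposal shifted into the tail,
    then reweight each hit by the likelihood ratio p(x)/q(x)."""
    total = 0.0
    for _ in range(n):
        x = random.gauss(shift, 1)  # proposal q = N(shift, 1)
        if x > THRESHOLD:
            # likelihood ratio of N(0, 1) against N(shift, 1)
            total += math.exp(-shift * x + shift**2 / 2)
    return total / n


# True tail probability P(N(0,1) > 4) is about 3.2e-5.
# With 10,000 samples the naive estimator typically returns 0.0
# (the rare event is almost never drawn), while the importance
# estimator lands close to the true value.
print(naive_estimate(10_000))
print(importance_estimate(10_000))
```

The point mirrors the blurb above: evaluations that sample "normally" can report zero failures even when a small but real tail risk exists, which is why smarter sampling matters before deployment.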
Assessing model risk is expensive; this paper shows how to sample less while still finding corner cases.
It’s deployment time! You’ve done the pre-deployment evals. You THINK your model is safe, so you ship it 🚀 🚨 After deployment, reports of misbehavior start trickling in. What happened?? How could you have caught it?? 🤔 @icmlconf 2026 Spotlight! 🧵