ICML 2026 spotlight paper estimates tail risks in LLMs
A spotlight paper at ICML 2026 presents methods to estimate tail risks in large language models. The work targets rare, high-impact failure modes that standard pre-deployment evaluations miss. It reduces evaluation cost by requiring fewer samples while still detecting corner cases that would otherwise surface only after real-world deployment. The findings indicate that models cleared by standard evaluations can still generate problematic outputs under broader conditions, underscoring the need for ongoing post-deployment monitoring of LLMs.
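The paper's specific estimator isn't described here, but the core difficulty it addresses can be illustrated with a minimal sketch: plain Monte Carlo sampling almost never hits a rare failure mode, whereas a variance-reduction technique such as importance sampling (used below purely as an illustrative stand-in, not as the paper's method) recovers the tail probability from far fewer samples. The threshold, distributions, and function names are all hypothetical.

```python
import math
import random

random.seed(0)

THRESHOLD = 4.0  # hypothetical: a "failure" is an outcome score beyond this tail cutoff


def naive_estimate(n: int) -> float:
    """Plain Monte Carlo: sample from the nominal output distribution
    (stand-in: standard normal) and count threshold exceedances."""
    hits = sum(1 for _ in range(n) if random.gauss(0, 1) > THRESHOLD)
    return hits / n


def importance_estimate(n: int, shift: float = 4.0) -> float:
    """Importance sampling: draw from a proposal shifted into the tail,
    then reweight each hit by the likelihood ratio p(x)/q(x)."""
    total = 0.0
    for _ in range(n):
        x = random.gauss(shift, 1)  # proposal q = N(shift, 1)
        if x > THRESHOLD:
            # likelihood ratio of N(0, 1) against N(shift, 1)
            total += math.exp(-shift * x + shift**2 / 2)
    return total / n


# True tail probability P(N(0,1) > 4) is about 3.2e-5.
# With 10,000 samples the naive estimator typically returns 0.0
# (the rare event is almost never drawn), while the importance
# estimator lands close to the true value.
print(naive_estimate(10_000))
print(importance_estimate(10_000))
```

The point mirrors the blurb above: evaluations that sample "normally" can report zero failures even when a small but real tail risk exists, which is why smarter sampling matters before deployment.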
Assessing model risk is expensive; this paper shows how to sample less while still finding corner cases.
It’s deployment time! You’ve done the pre-deployment evals. You THINK your model is safe, so you ship it 🚀 🚨 After deployment, reports of misbehavior start trickling in. What happened?? How could you have caught it?? 🤔 @icmlconf 2026 Spotlight! 🧵