Mingqian Zheng releases CARRYONBENCH benchmark for LLM clarification

Mingqian Zheng released CARRYONBENCH, an interactive benchmark of 5,970 simulated conversations across 14 models. It measures whether large language models revise initial refusals of ambiguous but benign queries once users clarify intent. Single-turn fulfillment rates after clarification range from 10.5 percent to 37.6 percent. The evaluation surfaces recurring failure modes including utility lock-in that blocks recovery and unsafe revisions that weaken the original refusal.

Original post

LLMs refuse ambiguous queries that look harmful but aren't. Can they recover once users clarify, while staying safe? Our new interactive multi-turn benchmark measures both. 🚨 Turns out: not both at once.

12:49 PM · May 13, 2026