/AI · 11h ago

Sharon Li introduces ModeX for evaluator-free Best-of-N selection, but Stella Biderman says the clustering method is standard textbook ML

The official implementation has been released on GitHub.

#207

Original post

Sharon Li @SharonYixuanLi · #604 in AI

Best-of-N sampling is often used to boost LLM performance, but the selection relies on external evaluators, adding cost and bias. What if you could select the best output without any external scoring at all?

Introducing ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation, accepted at #ACL2026 Main! (led by Hyeong Kyu Choi @HyeonggyuC)

💡 Our key insight: among multiple LLM generations, high-quality outputs tend to cluster together semantically. The best answer is the modal one: the generation that captures the dominant consensus.

How ModeX works:
1️⃣ Build a similarity graph over N candidate generations
2️⃣ Recursively apply spectral clustering via the Fiedler vector to isolate the dominant semantic cluster
3️⃣ Select the centroid of that cluster as the final output
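
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the official implementation from the linked repository: the cosine-similarity graph, the sign-based Fiedler split, the `min_size` and `max_depth` stopping rules, and all function names are assumptions made for the example.

```python
import numpy as np


def cosine_similarity_graph(embeddings):
    """Step 1 (assumed form): non-negative cosine similarities, no self-loops."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    W = np.clip(X @ X.T, 0.0, None)  # drop negative similarities
    np.fill_diagonal(W, 0.0)
    return W


def fiedler_split(W):
    """Bipartition nodes by the sign of the Fiedler vector of L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return vecs[:, 1] >= 0       # eigenvector of the second-smallest eigenvalue


def select_modal(W, min_size=3, max_depth=2):
    """Steps 2 and 3: keep the larger side of each split, then pick its centroid.

    min_size and max_depth are hypothetical stopping rules for this sketch.
    """
    idx = np.arange(W.shape[0])
    for _ in range(max_depth):
        if len(idx) <= min_size:
            break
        mask = fiedler_split(W[np.ix_(idx, idx)])
        # Treat the larger partition as the dominant cluster.
        idx = idx[mask] if mask.sum() >= (~mask).sum() else idx[~mask]
    sub = W[np.ix_(idx, idx)]
    # Centroid here means the node with the highest total similarity
    # to the other members of the final cluster.
    return int(idx[np.argmax(sub.sum(axis=1))])


# Toy usage: five candidates that point in similar directions, two outliers.
angles = np.deg2rad([0, 5, 10, -5, -10, 90, 95])
toy_embeddings = np.stack([np.cos(angles), np.sin(angles)], axis=1)
toy_graph = cosine_similarity_graph(toy_embeddings)
print("selected candidate index:", select_modal(toy_graph))
```

In practice the rows of `embeddings` would come from whatever text encoder is used to compare the N generations; the post does not say which similarity measure ModeX uses, so that choice is left open here.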

No reward models. No external evaluators. No auxiliary inference. Just the texts themselves.

📊 Results across text summarization (CNN/DailyMail), code generation (HumanEval), and math reasoning (Math-500) show ModeX consistently outperforms single-path and multi-path baselines, achieving state-of-the-art among evaluator-free methods.

We also provide theoretical justifications connecting our graph-based mode selection to kernel density estimation, grounding the approach with principled foundations.
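
The post does not reproduce the paper's theoretical statement. As a hedged illustration, the standard identity that such a connection usually rests on is shown below, assuming the edge weights of the graph are kernel values $W_{ij} = K_h(x_i, x_j)$ over embeddings $x_1, \dots, x_n$ (the symbols $K_h$, $x_i$, and $d_i$ are notation introduced for this sketch, not taken from the paper):

```latex
\hat{p}_h(x) = \frac{1}{n} \sum_{j=1}^{n} K_h(x, x_j)
\quad\Longrightarrow\quad
d_i = \sum_{j=1}^{n} W_{ij} = n \, \hat{p}_h(x_i)
\quad\Longrightarrow\quad
\arg\max_i d_i = \arg\max_i \hat{p}_h(x_i)
```

Under that assumption, the candidate with the largest total similarity is also the sample at which the kernel density estimate is highest, which is the sense in which a graph centroid can be read as an empirical mode.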

📄 Paper: http://arxiv.org/abs/2601.02535 💻 Code: http://github.com/deeplearning-wisc/ModeX

Sometimes the best signal is already hiding in the samples; you just need to find the mode. 🎯

7:49 AM · Jun 9, 2026 · 17 Views

Cluster Engagement

Posts from X

Most Activity

Views 4.4K · Bookmarks 40 · Likes 51 · Retweets 5 · Replies 3

Sharon Li @SharonYixuanLi

Introducing our #ACL2026 paper ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation! (led by Hyeong Kyu Choi @HyeonggyuC)

11h · 4.4K views · 51 likes · 40 bookmarks

Stella Biderman @BlancheMinerva

@SharonYixuanLi This methodology can be found in virtually every ML textbook in the world and is already in widespread use.

4h39732

Andreas Kirsch 🇺🇦 @BlackHC

Surely you would want to cite works on semantic uncertainty:

https://arxiv.org/abs/2302.09664

or the Nature version:

https://www.nature.com/articles/s41586-024-07421-0

or other works by Yarin Gal, e.g.:

https://proceedings.neurips.cc/paper_files/paper/2024/hash/10c456d2160517581a234dfde15a7505-Abstract-Conference.html

A bit surprised that the related work section is so short on semantic approaches for clustering

10h7921