ModeX Selects Best LLM Outputs Via Semantic Clustering Without Evaluators

Original post

Sharon Li (@SharonYixuanLi) · #1391 in Tech

Best-of-N sampling is often used to boost LLM performance, but the selection relies on external evaluators, adding cost and bias. What if you could select the best output without any external scoring at all?

Introducing ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation, accepted at #ACL2026 Main! (led by Hyeong Kyu Choi @HyeonggyuC)

💡 Our key insight: among multiple LLM generations, high-quality outputs tend to cluster together semantically. The best answer is the modal one: the generation that captures the dominant consensus.

How ModeX works:
1⃣ Build a similarity graph over N candidate generations
2⃣ Recursively apply spectral clustering via the Fiedler vector to isolate the dominant semantic cluster
3⃣ Select the centroid of that cluster as the final output
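
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that). It assumes a precomputed symmetric, non-negative similarity matrix `S`. The function names, the "keep the larger side" rule, the `min_size` stopping threshold, and the "highest total similarity" notion of centroid are all hypothetical choices made for this example.

```python
import numpy as np


def fiedler_split(S):
    """Bipartition a similarity graph by the sign of its Fiedler vector."""
    degree = S.sum(axis=1)
    laplacian = np.diag(degree) - S           # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(laplacian)       # eigenvalues in ascending order
    fiedler = vecs[:, 1]                      # eigenvector of 2nd-smallest eigenvalue
    return fiedler >= 0


def _cohesion(S, idx):
    """Mean pairwise similarity inside a group (used only to break size ties)."""
    return S[np.ix_(idx, idx)].mean()


def select_mode(S, min_size=3):
    """Return the index of the candidate chosen as the 'modal' generation.

    Assumption: the dominant cluster is found by repeatedly keeping the
    larger side of a Fiedler split until at most `min_size` nodes remain.
    """
    idx = np.arange(S.shape[0])
    while len(idx) > min_size:
        mask = fiedler_split(S[np.ix_(idx, idx)])
        a, b = idx[mask], idx[~mask]
        if len(a) == 0 or len(b) == 0:        # graph could not be split further
            break
        idx = max((a, b), key=lambda g: (len(g), _cohesion(S, g)))
    # Centroid (assumed): the node most similar to the rest of the cluster.
    sub = S[np.ix_(idx, idx)]
    return int(idx[np.argmax(sub.sum(axis=1))])


if __name__ == "__main__":
    # Toy graph: candidates 0-4 agree with each other, 5-6 are outliers.
    S = np.full((7, 7), 0.1)
    S[:5, :5] = 0.9
    S[5:, 5:] = 0.8
    np.fill_diagonal(S, 1.0)
    print("selected candidate:", select_mode(S))
```

In practice `S` would come from some text-similarity measure over the N generations. The post does not say which measure is used, so that choice is left open here.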

No reward models. No external evaluators. No auxiliary inference. Just the texts themselves.

📊 Results across text summarization (CNN/DailyMail), code generation (HumanEval), and math reasoning (Math-500) show ModeX consistently outperforms single-path and multi-path baselines, achieving state-of-the-art among evaluator-free methods.

We also provide theoretical justifications connecting our graph-based mode selection to kernel density estimation, grounding the approach with principled foundations.
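
One standard way to see why a graph-based selection can relate to kernel density estimation is sketched below. The post does not give the paper's actual argument, so this is not it. The symbols here (kernel $K_h$, edge weights $W_{ij}$, degree $d_i$) are assumptions introduced for illustration.

```latex
% Assume edge weights are kernel evaluations between candidates x_i, x_j:
%   W_{ij} = K_h(x_i, x_j)
\hat{p}(x_i) = \frac{1}{N} \sum_{j=1}^{N} K_h(x_i, x_j),
\qquad
d_i = \sum_{j=1}^{N} W_{ij} = N \, \hat{p}(x_i)
\quad\Longrightarrow\quad
\arg\max_i d_i = \arg\max_i \hat{p}(x_i)
```

Under that assumption, the candidate with the largest weighted degree is also the candidate at which the kernel density estimate is highest, that is, an estimate of the mode among the samples.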

📄 Paper: http://arxiv.org/abs/2601.02535 💻 Code: http://github.com/deeplearning-wisc/ModeX

Sometimes the best signal is already hiding in the samples; you just need to find the mode. 🎯

7:49 AM · Jun 9, 2026 · 17 Views

Sentiment

Users criticized the ModeX paper on selecting LLM outputs via semantic clustering for its short related work section and missing citations to prior semantic uncertainty research.

Positive: 0.0%
Negative: 100.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 2.9K · Bookmarks: 28 · Likes: 40 · Retweets: 5 · Replies: 2

Sharon Li (@SharonYixuanLi)

Introducing our #ACL2026 paper ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation! (led by Hyeong Kyu Choi @HyeonggyuC)

No reward models. No external evaluators. No auxiliary inference. Just the texts themselves.

We also provide theoretical justifications connecting our graph-based mode selection to kernel density estimation, grounding the approach with principled foundations.

📄 Paper: http://arxiv.org/abs/2601.02535 💻 Code: http://github.com/deeplearning-wisc/ModeX

Sometimes the best signal is already hiding in the samples; you just need to find the mode. 🎯

6h · 2.9K views · 40 likes · 28 bookmarks

Andreas Kirsch 🇺🇦 (@BlackHC)

Surely you would want to cite works on semantic uncertainty:

https://arxiv.org/abs/2302.09664

or the nature version:

https://www.nature.com/articles/s41586-024-07421-0

or other works by Yarin Gal, e.g.:

https://proceedings.neurips.cc/paper_files/paper/2024/hash/10c456d2160517581a234dfde15a7505-Abstract-Conference.html

A bit surprised that the related work section is so short on semantic approaches for clustering.

6h7921