
Paper Proposes Personalized Benchmarking To Evaluate LLMs By User Preferences

Original post · Chenhao Tan · #604

When did you last check a leaderboard before picking an LLM? 🤔 Excited to share that our paper "Personalized Benchmarking: Evaluating LLMs by Individual Preferences" was accepted to ACL Findings 2026! 🎉 Joint work with Heran Wang and @ChenhaoTan

10:26 AM · Jun 10, 2026 · 1.3K Views
Sentiment

The only comment with a detected sentiment is positive: the commenter says the UChicago researchers' announcement of their personalized LLM benchmarking paper is the kind of post they like.

Positive: 100.0% · Negative: 0.0% (1 comment with sentiment)

We found that for 57% of active Chatbot Arena users, individual rankings are statistically indistinguishable from a random ordering of models under Bradley-Terry. Users show substantial heterogeneity in topical interests and communication styles.

5h · 23 Views
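For readers unfamiliar with Bradley-Terry, the following is a minimal sketch of how a ranking can be fit from one user's pairwise votes. It is not the paper's code or its significance test. The function names, the smoothing `prior`, and the toy votes are all assumptions made for illustration. Each model gets a positive strength, and the probability that model i beats model j is modeled as strength_i / (strength_i + strength_j).

```python
from collections import defaultdict


def fit_bradley_terry(comparisons, n_iter=200, prior=0.1):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Uses the standard minorization-maximization update. `prior` adds a
    small virtual win in both directions for every pair (an illustrative
    smoothing choice) so strengths stay finite with sparse votes.
    """
    models = sorted({m for pair in comparisons for m in pair})
    wins = defaultdict(float)  # wins[(i, j)] = number of times i beat j
    for winner, loser in comparisons:
        wins[(winner, loser)] += 1.0
    for i in models:
        for j in models:
            if i != j:
                wins[(i, j)] += prior

    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(n_iter):
        updated = {}
        for i in models:
            total_wins = sum(wins[(i, j)] for j in models if j != i)
            denom = sum(
                (wins[(i, j)] + wins[(j, i)]) / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            updated[i] = total_wins / denom
        norm = sum(updated.values())
        strength = {m: s / norm for m, s in updated.items()}
    return strength


def win_probability(strength, i, j):
    """Modeled probability that model i is preferred over model j."""
    return strength[i] / (strength[i] + strength[j])


# Hypothetical votes from a single user over three placeholder models.
votes = (
    [("A", "B")] * 8 + [("B", "A")] * 2
    + [("B", "C")] * 7 + [("C", "B")] * 3
    + [("A", "C")] * 9 + [("C", "A")] * 1
)
strengths = fit_bradley_terry(votes)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking, round(win_probability(strengths, "A", "C"), 3))
```

A user who casts only a handful of votes gives such a fit very little signal, which is one plausible reading of why many individual rankings cannot be told apart from a random ordering.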
Alexa Web3 (e/acc) @alexabelonix

@ggarbacea @ChenhaoTan this is the kind of post i like.

5h · 13 Views · 1 Like

By modeling these features, we predict user-specific rankings and cut prediction error by up to 35% over aggregate baselines! 📉✨

Read the paper here: https://arxiv.org/abs/2604.18943

5h · 17 Views