Link to the paper: https://arxiv.org/abs/2605.10930 2/
There is growing interest in having AI systems work with humans rather than replacing them. The research questions are, to be honest, harder in the former case! One challenge: how do end users modulate their trust in the answers provided by LLMs? 1/

A new pre-print by @biswas_2707 and @PalodVardh12428 proposes a framework for evaluating trust--deserved and false--engendered by LLMs. Premise: LLMs will increasingly be used in cases where the end user doesn't have the capacity to verify the result. How do we empower users to develop appropriate trust in the answers?

Traditionally, we trust answers we can't verify based on prior guarantees--be they FAA certification of planes or wisdom-of-crowds PageRank certification of Google pages. These don't quite work for broad and shallow LLMs, which are (in)famous for their jagged intelligence--correct on Math Olympiad problems one minute while failing the next on simple teasers that depend on unarticulated commonsense (viz. @conitzer's substack 😋).

This paper first shows that the most obvious ideas for augmenting LLM answers with additional information--including (1) "thinking traces" of LLMs, (2) summaries of thinking traces, and (3) post-facto explanations--all significantly increase false trust in end users.

It then shows that differential explanations--asking LLMs to provide explanations both supporting and opposing their answers--do a better job of modulating end-user trust. (This idea is not unlike having noisy reviewers of your #AI conference papers write arguments both in favor of and in opposition to acceptance of your paper--with the AC then using that information to calibrate their final decision.)
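For concreteness, here is a minimal sketch (mine, not from the paper) of what the differential-explanation idea could look like as a prompting pattern: ask for an answer, then separately ask for the strongest case for and against it, and show the user both. The `ask_llm` wrapper and the exact prompt wording are assumptions--stand-ins for whatever client and phrasing you actually use.

```python
# Sketch of a "differential explanation" prompting pattern (an assumption, not
# the paper's implementation). `ask_llm` is a hypothetical wrapper around your
# LLM provider's chat-completion call.

def ask_llm(prompt: str) -> str:
    """Hypothetical wrapper; plug in your own LLM client here."""
    raise NotImplementedError

def differential_explanation(question: str) -> dict:
    # 1. Get the model's answer.
    answer = ask_llm(f"Answer the following question concisely:\n{question}")
    # 2. Ask for the strongest argument that the answer is correct.
    supporting = ask_llm(
        f"Question: {question}\nProposed answer: {answer}\n"
        "Give the strongest argument that this answer is correct."
    )
    # 3. Ask for the strongest argument that the answer is wrong or incomplete.
    opposing = ask_llm(
        f"Question: {question}\nProposed answer: {answer}\n"
        "Give the strongest argument that this answer is wrong or incomplete."
    )
    # The end user sees both sides and calibrates their trust, much like an AC
    # weighing reviewer arguments for and against acceptance.
    return {"answer": answer, "for": supporting, "against": opposing}
```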
@rao2z Thanks for the shoutout :-)