Sixing Chen posts preprint on LLM reasoning traces
Sixing Chen, a Ph.D. student at New York University, posted the preprint "Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning" to arXiv (identifier 2605.06840). The work extracts explicit search trees from the chain-of-thought outputs of reasoning large language models and shows that the traces match myopic planning algorithms rather than comprehensive forward-looking strategies. The analysis focuses on the internal structure of extended reasoning sequences to distinguish pattern-based token generation from algorithm-like planning.
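The trace-to-tree idea the abstract describes can be sketched in a toy form: replay a linearized reasoning trace (expand a candidate, backtrack, try a sibling) into an explicit tree, then compute simple shape statistics. The event vocabulary and the "narrow tree ⇒ myopic search" heuristic below are illustrative assumptions of this sketch, not the paper's actual extraction method.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    label: str
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)

def build_tree(events):
    """Replay a linearized trace into an explicit search tree.

    Events (assumed vocabulary): ("expand", label) opens a new child
    and descends into it; ("backtrack",) returns to the parent.
    """
    root = Node("root")
    cur = root
    for ev in events:
        if ev[0] == "expand":
            child = Node(ev[1], parent=cur)
            cur.children.append(child)
            cur = child
        else:  # backtrack
            cur = cur.parent
    return root

def depth(node):
    """Length of the longest root-to-leaf path."""
    if not node.children:
        return 0
    return 1 + max(depth(c) for c in node.children)

def mean_branching(node):
    """Average out-degree over internal nodes. A greedy depth-first
    (myopic) trace yields narrow trees with branching near 1; a
    lookahead strategy fans out siblings before descending."""
    internal_count, total = 0, 0
    stack = [node]
    while stack:
        n = stack.pop()
        if n.children:
            internal_count += 1
            total += len(n.children)
            stack.extend(n.children)
    return total / internal_count if internal_count else 0.0

# Toy trace: try A, explore children A1 and A2, give up, try B.
trace = [("expand", "A"), ("expand", "A1"), ("backtrack",),
         ("expand", "A2"), ("backtrack",), ("backtrack",),
         ("expand", "B")]
tree = build_tree(trace)
```

On the toy trace this yields a tree of depth 2 with two children under the root, the kind of structure one could then compare against the trees produced by known search algorithms.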
@xuanalogue IMHO this seems to be another effort that uncritically takes the pseudo-English intermediate tokens of *pre-trained* LRMs at face value, and doesn't account for the possibility that LRMs may just be imitating the style of those traces. 👉 https://youtu.be/rvbyH1nfIrg&t=484
Very cool work! We've seen since o1 that reasoning-trained LLMs are capable of some kind of backtracking search in natural language -- now we have further insight into what kinds of search algorithms they seem to have learned!
New preprint! When reasoning LLMs deliberate over possible futures, are they actually planning? https://arxiv.org/abs/2605.06840
Some related thoughts on what kind of search strategies LLMs seem to use, back when Claude Plays Pokémon first came out: