/Tech · 7h ago

RAGEN-2 Paper Proposes MI Metrics to Detect Template Collapse in RL

Original post unavailable.

Sentiment

One commenter confirms that the RAGEN-2 paper's template collapse observations match their production experience, where the model gave near-identical answers across different inputs even though per-response diversity metrics looked fine.

Pos

100.0%

Neg

0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 204 · Likes 1

Cameron R. Wolfe, Ph.D.@cwolferesearch

sorry! link to paper is here: https://arxiv.org/abs/2604.06268

4h · 204 views · 1 like

Bookmarks 1

Harris7@KrishWiller

@cwolferesearch https://arxiv.org/abs/2604.06268

5h441

Blissy@BlissyOnX

@cwolferesearch token level is cool but what about agent rollup entropy for full multi-turn episodes

feels like that matters more in their setting

7h79

Guilherme O'Tina@guilhermeotina

the snr filtering via reward variance is neat, but there's a bootstrapping tension: if the model is already collapsed, reward variance is low, so the filter avoids the prompts that could break the collapse. feels like you'd need deliberate high-variance injections as a reset mechanism

6h69
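
The bootstrapping tension raised in the post above can be illustrated with a small sketch. It is not the paper's implementation: the function name `filter_by_reward_variance`, the `min_var` threshold, and the reward values are all hypothetical, and the thread does not say how RAGEN-2 computes its signal-to-noise filter. The sketch only assumes that a prompt is kept when the rewards of its sampled rollouts vary enough.

```python
from statistics import pvariance


def filter_by_reward_variance(rewards_by_prompt, min_var=0.01):
    """Keep prompts whose sampled rollouts disagree enough in reward.

    Hypothetical filter for illustration; min_var is an assumed threshold.
    """
    kept = {}
    for prompt, rewards in rewards_by_prompt.items():
        # Population variance of the rewards sampled for this prompt.
        if len(rewards) >= 2 and pvariance(rewards) >= min_var:
            kept[prompt] = rewards
    return kept


# Made-up rewards: four rollouts per prompt.
rewards_by_prompt = {
    "healthy_prompt": [0.0, 1.0, 1.0, 0.0],    # rollouts differ, variance 0.25
    "collapsed_prompt": [0.0, 0.0, 0.0, 0.0],  # identical rollouts, variance 0
}

kept = filter_by_reward_variance(rewards_by_prompt)
print(sorted(kept))
```

Under these assumptions the collapsed prompt is exactly the one that gets dropped, which is the concern in the post: once the policy produces the same outcome on a prompt every time, a variance filter stops selecting that prompt for training.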

Strata@ChainZenit

@cwolferesearch Standard RL metrics usually fail to capture the full picture anyway.

7h60

Rugbist@rugbist_

@cwolferesearch token-level entropy only captures breadth inside one response

mutual info across responses seems like the real tell

7h52
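
The contrast drawn in the post above, between diversity within responses and dependence of responses on the input, can be sketched with a plug-in mutual information estimate. This is a toy under stated assumptions, not the paper's estimator: responses are assumed to be already mapped to discrete template labels ("A", "B"), and the prompt IDs and data are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy in bits of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y) in bits."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return entropy(xs) + entropy(ys) - entropy(pairs)


# Hypothetical (prompt, response template) samples.
# Collapsed: every prompt draws from the same two templates.
collapsed = [("p1", "A"), ("p1", "B"), ("p2", "A"), ("p2", "B")]
# Input-dependent: the template is determined by the prompt.
input_dependent = [("p1", "A"), ("p1", "A"), ("p2", "B"), ("p2", "B")]

for name, pairs in [("collapsed", collapsed), ("input_dependent", input_dependent)]:
    response_entropy = entropy([y for _, y in pairs])
    print(name, response_entropy, mutual_information(pairs))
```

Both datasets have the same response entropy (1 bit), so an entropy-only view cannot tell them apart. The mutual information differs: it is zero when the responses ignore the prompt and 1 bit when the prompt determines the response.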

Shuying Luo@shuying_luo

@cwolferesearch It would be interesting to see cross turn measures as well. In long term rollouts, later reasoning is probably less related to the initial prompt than the state

5h34

TecAce@tecaceai

@cwolferesearch Template collapse matches what we hit in production — per-response diversity looked fine while the model gave near-identical answers across very different inputs. Mutual info across responses is the signal we proxied by hand. Does it hold online, not just offline eval?

5h2
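
The post above does not say how the hand-built proxy worked. One simple possibility, offered purely as an assumption, is to take one response per distinct input and measure the average pairwise token overlap; high overlap across very different inputs would hint at templated answers. The function names and example strings below are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-set Jaccard similarity between two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def mean_cross_prompt_similarity(responses):
    """Average pairwise similarity of responses to *different* prompts."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Invented responses, one per distinct input.
templated = [
    "first check the state then pick the safest move",
    "first check the state then pick the safest move now",
]
varied = [
    "push the box left toward the goal",
    "buy two apples and sell one pear",
]

print(mean_cross_prompt_similarity(templated))
print(mean_cross_prompt_similarity(varied))
```

Token overlap is a crude stand-in for semantic similarity, so a score like this is better read as a cheap alarm than as a replacement for a mutual information estimate.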