also bullish for logprob-aware decay?
my current pet idea is something like ECHO but with self-generated hint as a logprob filter on env tokens (a la OPSD) to mitigate overtraining, and a core RL objective with some kind of pressure for hints to target useful world models
occams razor answer to "what's going on here" is grokking imo
bullish for cartridges

