9h ago

Pavel Izmailov and David Chalmers find reinforcement learning recruits a pre-existing "functional welfare" axis in language models

These internal activation vectors steer model confidence, sentiment, and refusal.

Sentiment: 100% positive, 0% negative (4 comments with sentiment)

Users appreciate the discovery of a functional welfare axis in RL-trained LLMs because they find the results interesting from multiple perspectives and value the compute support plus collaboration behind the work.