9h ago
Andy Q. Han finds reinforcement learning on LLMs produces internal "valence vectors" representing high- and low-reward actions
These vectors influence unrelated behaviors like sentiment and refusal.