We're introducing a mechanism for enforcing some control over an LLM's reasoning process. We developed Behavior Cues to steer models, avoiding both overthinking and speculative collapse. @ccui9 provides a great overview of the work in this thread 👇
This has been sitting on arXiv for a bit, but figured it's time to announce it properly. Introducing Behavior Cues: a way to make LLM reasoning more monitorable and controllable for scalable oversight.