10h ago

Researchers Simplify Multi-Turn RL Using One Rule And Chat Template

Original post

multi-turn RL and the "tito" problem keep coming up. we've been working on it for a while, and the takeaway is that it's much easier than people are making it out to be. it takes 1 implementation rule and 1 chat-template property that all models already comply with. **that's all you need to do it right** https://qgallouedec-tito.hf.space

2:28 AM · May 28, 2026