xAI's Ethan He applies Monte Carlo Tree Search at inference to prevent semantic drift in long video generation
The look-ahead rollouts increase compute cost at inference time.
@EthanHe_42 That’s an interesting idea
We applied AlphaGo's algorithm to video generation. Long video generation often breaks after a few extensions. We use MCTS to evaluate multiple continuations with look-ahead rollouts and backpropagated rewards. It produces long videos while maintaining comparable visual fidelity. The honest caveat is increased compute cost, which I think might be acceptable once video model capability exceeds a certain usability threshold. Paper: https://openreview.net/forum?id=ilir6A52vh
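
To make the search loop described above concrete, here is a minimal toy sketch of MCTS choosing the next segment. It is not the paper's implementation: a video segment is reduced to a single float measuring semantic drift, and `propose_continuations`, `reward`, and `mcts_next_segment` are hypothetical stand-ins for sampling from a video model and scoring consistency. UCB1 is assumed as the selection rule, since the post does not say which one is used.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # toy stand-in for a generated video segment
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0        # sum of backpropagated rewards

def propose_continuations(state, k, rng):
    # Hypothetical: a real system would sample k extensions from a video model.
    return [state + rng.gauss(0.0, 1.0) for _ in range(k)]

def reward(state):
    # Hypothetical consistency score in (0, 1]; highest when drift is near 0.
    return 1.0 / (1.0 + abs(state))

def ucb1(child, parent_visits, c=1.4):
    # Assumed selection rule: unvisited children are always tried first.
    if child.visits == 0:
        return float("inf")
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(state, depth, rng):
    # Look-ahead: extend randomly for `depth` steps, then score the endpoint.
    for _ in range(depth):
        state = propose_continuations(state, 1, rng)[0]
    return reward(state)

def mcts_next_segment(root_state, iterations=200, k=4, depth=3, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend through expanded nodes by UCB1.
        while node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: give an already-visited leaf k candidate continuations.
        if node.visits > 0:
            node.children = [Node(s, node)
                             for s in propose_continuations(node.state, k, rng)]
            node = node.children[0]
        # Simulation, then backpropagation of the reward up to the root.
        r = rollout(node.state, depth, rng)
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Commit to the most-visited first-level continuation.
    best = max(root.children, key=lambda ch: ch.visits)
    return best.state, root

best_state, root = mcts_next_segment(0.0)
print(best_state, [ch.visits for ch in root.children])
```

In this sketch every iteration spends `depth` extra generation calls on a rollout whose output is discarded, which is where the added inference cost mentioned in the post comes from.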