Coarse-to-fine is one of the most important techniques for getting a CE win when training visual generation models. Congratulations to my lab, who just brought this to video generation!
Introducing MilliVid, our new method for long-context video generation! MilliVid creates videos that are consistent over long time spans, without using retrieval heuristics or 3D maps! (1/n) https://davidcharatan.com/millivid/#