9h ago

DynaFLIP fuses vision, language, and 3D motion to outperform DINOv2 and SigLIP on robot policies

The framework was trained on 260K video trajectories.