Current robot policies overfit to specific language templates, handling 'pick and place' but freezing on 'drag it to me' or 'push it closer to me.' They also lack control over execution: which hand, what approach angle, where to grasp, which path to follow.
🤖 FineVLA makes robots steerable: changing the instruction changes the execution. Same task, different phrasing, distinct actions, all faithfully carried out.
🏠 Homepage: https://finevla.xlang.ai 📄 Paper: https://huggingface.co/papers/2605.27284 💻 Codebase: https://github.com/xlang-ai/FineVLA
🧵[1/6]
