XVR Paper Accepted To CVPR 2026 Boosts VLM Cross-View Reasoning

Suchae Jeong (@Suchaeck)

🚀 Our paper "Learning Multi-View Spatial Reasoning from Cross-View Relations (XVR)" has been accepted to #CVPR2026!

Current VLMs can reason from a single view surprisingly well, but they still struggle to connect information across multiple viewpoints.

To address this, we introduce XVR:
• 100K-sample VQA dataset
• 3 categories, 8 tasks
• Designed specifically for cross-view spatial reasoning

Most excitingly, cross-view reasoning transfers to robot manipulation. Using an XVR-trained VLM as a VLA backbone improves RoboCasa manipulation success rates by +13%p on average.

Project page: https://cross-view-relations.github.io/
Paper: https://arxiv.org/abs/2603.27967

🍿 More details below

2:36 PM · Jun 2, 2026 · 806 Views

Sentiment

Users are excited about the XVR dataset: it improves VLMs' cross-view spatial reasoning, which carries over to robot manipulation, and it comes with released code and pipelines for further multi-view work.

Positive: 100.0%

Negative: 0.0%

Based on 2 comments with sentiment.

Posts from X

Most Activity

Views: 520 · Bookmarks: 2 · Likes: 5

Kimin (@kimin_le2)

Introducing XVR, a new dataset for improving spatial reasoning across multiple views! We show that better spatial reasoning leads to stronger VLM backbones for VLAs. :)

If you’re interested, come chat with @Suchaeck and @Jay019374 at #CVPR2026
