Honey, I Shrunk the Arc de Triomphe! 😱 Ever notice how SOTA depth models suffer from "scale-collapse"—metrically shrinking distant landmarks like they're toys? We introduce MetricScenes: a new in-the-wild metric dataset that fixes this!
A really nice paper with a really cool title from Yuanbo, Hanyu, and Xueqing!

[4/5] By fine-tuning MoGe-2 on MetricScenes, our model (WildMoGe) significantly mitigates scale-collapse in unconstrained, in-the-wild scenes while maintaining competitive, state-of-the-art performance on standard benchmarks.

[1/5]🔴 The Problem: Real-world metric data is trapped in street-level LiDAR or indoor scans. Synthetic data lacks real complexity. 🔵 Our Fix: We recovered absolute physical scale from Internet photos & stereo imagery using geo-tags and camera baselines!
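
To make the "camera baselines" and "geo-tags" part concrete, here is a minimal Python sketch. The stereo function is the standard rectified-stereo relation Z = f * B / d. The geo-tag function is only an assumption about how geo-tags could fix scale (median ratio of pairwise camera distances); the thread does not spell out the actual procedure, and both function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def stereo_depth_m(focal_px, baseline_m, disparity_px):
    """Metric depth for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def sfm_scale_from_geotags(sfm_positions, geo_positions_m):
    """Hypothetical helper: metres per SfM unit from geo-tagged cameras.

    Assumes geo_positions_m are the geo-tags already converted to a local
    metric frame (x, y, z in metres), in the same order as sfm_positions.
    The median over camera pairs gives some robustness to noisy tags.
    """
    ratios = []
    for i, j in combinations(range(len(sfm_positions)), 2):
        d_sfm = math.dist(sfm_positions[i], sfm_positions[j])
        if d_sfm > 0:
            d_geo = math.dist(geo_positions_m[i], geo_positions_m[j])
            ratios.append(d_geo / d_sfm)
    if not ratios:
        raise ValueError("need at least two distinct camera positions")
    return median(ratios)
```

Multiplying an up-to-scale reconstruction by the returned factor would put its depths in metres, under the assumptions above.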

[3/5] Anchors should exist in both the background and the foreground. We fuse MVS depth maps with model-predicted foreground depths in two stages: (1) a Poisson solve locks in the absolute background scale; (2) a second, edge-weighted Poisson solve runs after the foreground anchors are added.
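
A rough 1D Python sketch of a two-stage fusion like this, not the released implementation. Assumptions: a plain Gauss-Seidel solver, edge weights of the form exp(-|predicted gradient| / sigma), a known set of background indices, and a foreground anchor given directly in metres. All names and numbers are illustrative.

```python
import math


def poisson_fill_1d(pred, fixed, weights=None, sweeps=2000):
    """Gauss-Seidel solve of a 1D, optionally weighted, Poisson completion.

    Minimizes sum_i w[i] * ((d[i+1] - d[i]) - (pred[i+1] - pred[i]))**2
    subject to d[k] == fixed[k] for every pinned index k.
    """
    n = len(pred)
    if weights is None:
        weights = [1.0] * (n - 1)  # weights[i] couples pixels i and i+1
    d = [fixed.get(i, pred[i]) for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            if i in fixed:
                continue
            num = den = 0.0
            if i > 0:  # left neighbour suggests d[i-1] + predicted step
                w = weights[i - 1]
                num += w * (d[i - 1] + (pred[i] - pred[i - 1]))
                den += w
            if i < n - 1:  # right neighbour suggests d[i+1] - predicted step
                w = weights[i]
                num += w * (d[i + 1] - (pred[i + 1] - pred[i]))
                den += w
            d[i] = num / den
    return d


def two_stage_fusion_1d(pred, mvs, fg_anchors, background_idx, sigma=1.0):
    # Stage 1: uniform Poisson solve pinned to the MVS depths.
    stage1 = poisson_fill_1d(pred, mvs)
    # Stage 2: keep the stage-1 background, add foreground anchors, and
    # down-weight gradient terms that cross strong predicted depth edges.
    fixed = {i: stage1[i] for i in background_idx}
    fixed.update(fg_anchors)
    weights = [math.exp(-abs(pred[i + 1] - pred[i]) / sigma)
               for i in range(len(pred) - 1)]
    return poisson_fill_1d(pred, fixed, weights)


# Toy scene: pixels 0-3 are foreground, 4-7 are a "collapsed" background.
pred = [5.0, 5.0, 5.0, 5.0, 20.0, 20.0, 20.0, 20.0]
fused = two_stage_fusion_1d(pred, mvs={5: 100.0, 7: 100.0},
                            fg_anchors={1: 5.0}, background_idx=[4, 5, 6, 7])
```

In this toy case the background stays at the MVS scale, while the weak weight on the foreground/background edge lets the foreground follow its own anchor instead of the background.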

[2/5] However, MVS depth maps cover only the background. Previous methods use Poisson completion to fill the missing regions with predicted depth gradients, using MVS depth as a boundary condition. Due to scale-collapse, foreground depths are incorrectly dragged toward the background.
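
A tiny 1D Python illustration of that failure mode, as a sketch rather than any specific prior method. It assumes a simple Gauss-Seidel solver, and the numbers are made up: the true foreground is at 5 m, the true background at 100 m, and the model predicts 5 and 20 (a collapsed background).

```python
def poisson_complete_1d(pred, known, sweeps=2000):
    """Fill unknown depths so that neighbouring differences match the
    predicted gradients, with `known` depths held fixed (boundary condition).
    """
    n = len(pred)
    d = [known.get(i, pred[i]) for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            if i in known:
                continue
            targets = []
            if i > 0:  # what the left neighbour implies for pixel i
                targets.append(d[i - 1] + (pred[i] - pred[i - 1]))
            if i < n - 1:  # what the right neighbour implies for pixel i
                targets.append(d[i + 1] - (pred[i + 1] - pred[i]))
            d[i] = sum(targets) / len(targets)
    return d


# Pixels 0-2: foreground (no MVS depth). Pixels 3-5: background with MVS depth.
pred = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
mvs = {3: 100.0, 4: 100.0, 5: 100.0}
completed = poisson_complete_1d(pred, mvs)
# The predicted foreground/background gap is only 15 m, so the foreground is
# placed near 100 - 15 = 85 m instead of its true 5 m.
```

Because only gradients are carried over, the underestimated foreground-to-background gap is the only thing separating the two, which is why the foreground ends up far too close to the background depth.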

🧵[5/5] Project page & code: https://metricscenes.github.io/. Many thanks to the team: Hanyu Chen (@hanyuc1110), Xueqing Tsang, and Noah Snavely (@Jimantha)!