It’s great to see the gap between reconstruction and generation narrowing so quickly this year.
https://kasothaphie.github.io/GenRecon/
It produces relightable PBR meshes from sparse RGB images.
Users are excited about GenRecon because it narrows the gap between reconstruction and generation by using generative priors to produce impressive high-fidelity details from sparse views.
📢📢GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction📢📢
Reconstructing high-fidelity 3D scenes from sparse RGB input is hard. It needs a strong 3D prior!
We reformulate multi-view scene reconstruction as conditional 3D generation over overlapping spatial chunks, lifting posed image features into a generative shape prior via 3D conditioning. As an example prior, we build on Trellis2 and train it so that its reconstruction is pixel-aligned and consistent across all views.
GenRecon achieves unprecedented reconstruction quality from any sparse RGB input sequence, even from a phone capture. The reconstruction also includes PBR materials, which facilitate relighting and virtual object insertion.
https://youtu.be/Tp-i06DPXa0 https://kasothaphie.github.io/GenRecon/
Amazing work by @katha_schmid, @nicolasvluetzow, Jozef, @angelaqdai
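To make the "overlapping spatial chunks" idea above concrete, here is a minimal sketch of how a scene's bounding box could be tiled into overlapping axis-aligned boxes. The thread does not say how GenRecon actually partitions a scene, so the function names (`chunk_starts`, `overlapping_chunks`), the cubic chunk shape, and the `chunk_size` and `overlap` parameters are all assumptions made for illustration.

```python
import itertools
import math


def chunk_starts(extent, chunk_size, overlap):
    """Start offsets of overlapping chunks covering [0, extent] on one axis.

    Hypothetical helper: not taken from the GenRecon code base.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    if extent <= chunk_size:
        # A single chunk already covers this axis.
        return [0.0]
    stride = chunk_size - overlap
    n = math.ceil((extent - chunk_size) / stride) + 1
    # Clamp the last chunk so it ends exactly at the scene boundary.
    return [min(i * stride, extent - chunk_size) for i in range(n)]


def overlapping_chunks(scene_min, scene_max, chunk_size, overlap):
    """Return (min_corner, max_corner) boxes that tile the scene bounds.

    Assumes cubic chunks of edge length `chunk_size` (an illustrative choice).
    """
    axes = [
        [lo + s for s in chunk_starts(hi - lo, chunk_size, overlap)]
        for lo, hi in zip(scene_min, scene_max)
    ]
    return [
        (corner, tuple(c + chunk_size for c in corner))
        for corner in itertools.product(*axes)
    ]


if __name__ == "__main__":
    # A 10 x 4 x 4 scene, chunk edge 4, overlap 1: three chunks along x.
    for box in overlapping_chunks((0.0, 0.0, 0.0), (10.0, 4.0, 4.0), 4.0, 1.0):
        print(box)
```

In a pipeline of this kind, each box would presumably be generated conditioned on the image features that project into it, with the shared overlap region helping neighbouring chunks agree. That part is not shown here.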

Here is another result on a short sequence of sparse RGB images. GenRecon operates on a chunk-by-chunk basis and the reconstruction accurately matches the constraints from the input images. At the same time, we obtain clean, high-quality 3D geometries with PBR materials.

congrats on the SIGGRAPH accept, that's a legit milestone for a first paper
the reconstruction/generation gap closing is the part i find most interesting from a practical side. Once generation can reliably produce geometry that's as queryable as a reconstructed mesh, the downstream tooling gets a lot simpler

@MattNiessner @angelaqdai I am surprised to see that multi-view reconstruction was achievable by just applying LoRA to Trellis, since the training space seems drastically different. Excited to try this out! Great work, team!

@kishoreVen1729 @angelaqdai Thanks, yeah it works quite well although it needed a bit of training. Overall, it seems to retain the clean shape prior quite well.

@MattNiessner wow, that's an impressive amount of detail for that quick swipe

@MattNiessner Amazing stuff! Love how 3D generation is getting closer and closer to the real data and is filling in all the missing data.

@MattNiessner Does this produce metric-accurate reconstructions, or at least preserve relative scale accurately enough to rescale later? If yes, this could be very useful for Real Estate Digital Twin applications.

@Luckyballa Went from 'can we NeRF this?' to 'just generate the missing bits.' Multi-view geometry hits a wall. GenRecon brute-forces it with a prior. I'm buying it. Probably the future.