Introducing World Tracing: generative pixel-aligned geometry, beyond the visible. Faithful to your pixels. Complete in 3D. One image in, and objects, scenes, even dynamic worlds emerge in full geometry, every point traced back to the pixel it came from. 🌐 http://haoz19.github.io/world-tracing-page 🧵 [1/6]
NeRF co-creator Ben Mildenhall's team at World Labs releases World Tracing to generate 3D geometry from a single image
The method extends reconstructions beyond visible surfaces.
Replies to World Labs' papers on AI-powered 3D content generation and pixel-aligned geometry are enthusiastic: users call the work cool, awesome, and a pivotal leap, and say they are eager to try it.
World Labs is publishing?! ❤️🔥
Today we are sharing three new research papers, each exploring a new way to generate 3D content by leveraging large-scale generative models and 2D priors.
These projects were led by our incredible interns @HaoZhang623 @BDuisterhof @DrTunnels
[1/4]
3D is an exciting area where we are still figuring out the right tasks, problem formulations, architectures, and the best ways to scale.
We're sharing some of our ideas here in our first-ever papers from @theworldlabs, led by an awesome set of interns.

World Tracing predicts full 3D from a single image. It outputs a stack of depth values for each input pixel, peeling the world into layers and predicting them with a diffusion model. This predicts full 3D (even occluded surfaces) while remaining faithful to the image.
[2/4]

Modality Forcing adapts text-to-image models to reason jointly about text, images, and depth. It shows that text-to-image is a scalable pretraining objective for 3D reasoning, and how text-to-RGBD, depth estimation, and depth-to-image can be unified in a single model.
[3/4]

Flex4DHuman lifts monocular video into dynamic 4D Gaussians. A video diffusion model is finetuned to generate synchronized multiview videos, which are distilled into 4D Gaussians. With this method, a video of a person dancing can be lifted to 4D and composited into a 3D world.

@theworldlabs @BenMildenhall @HaoZhang623 @BDuisterhof @DrTunnels Very fun to work with our awesome WL interns!😍🌐

@theworldlabs @HaoZhang623 @BDuisterhof @DrTunnels These projects are awesome! Great work!
Working with @HaoZhang623 , @BDuisterhof , and @DrTunnels on these projects over the past few months has been great - congrats to all of you!

Big thanks to all collaborators and everyone who supported this project: @theworldlabs @gengshanY @BenMildenhall @chlassner @jcjohnss @KeunhongP @_mbanani @DrTunnels @Hawaii271828 @PaulZhang @cuijiaxingfb @BDuisterhof @zixuan_huang @drfeifei


@theworldlabs @HaoZhang623 @BDuisterhof @DrTunnels This is so cool! I can't wait to try it out and make projects with it!

@theworldlabs @drfeifei @HaoZhang623 @BDuisterhof @DrTunnels Awesome work!

@HaoZhang623 Excited to try this out!

@HaoZhang623 this is cool, nice work

@theworldlabs @HaoZhang623 @BDuisterhof @DrTunnels Started working with procedural generation. AGENTROPOLIS is where I see this heading: agent-populated 3D worlds, persistent environments, and autonomous simulation layers. This update has my full attention. 👀⚡️
http://AGENTROPOLIS.dev Agent-populated worlds are next.

Paper: https://arxiv.org/abs/2606.13652
Page: https://haoz19.github.io/world-tracing-page/
Code: https://github.com/haoz19/world-tracing
Live Demo: https://huggingface.co/spaces/haoz19/world-tracing-demo

Single-image to 3D forced a choice: depth is faithful but stops at the surface, or generation is complete but drifts from your pixels.
World Tracing dissolves the trade-off: each pixel ray carries an ordered stack of 3D points, from the visible surface to the geometry hidden behind it. One tensor. One model. Reconstruction and generation, unified.
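
To make the "ordered stack of 3D points per pixel ray" idea concrete, here is a minimal sketch of how a per-pixel stack of depths could be lifted to pixel-aligned 3D points. It is not the paper's code: the pinhole camera model, the pixel-center convention, the use of `None` for empty layers, and the names `unproject_depth_stack`, `pixel_of`, and `is_front_to_back` are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def unproject_depth_stack(
    depth_stack: List[List[List[Optional[float]]]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[List[List[Point]]]:
    """Lift a per-pixel stack of depths to a per-pixel stack of 3D points.

    depth_stack[v][u] lists depths along the ray through pixel (u, v),
    ordered front to back; None marks a layer with no surface.
    Assumption: pinhole camera, depth measured along the camera z-axis.
    """
    points = []
    for v, row in enumerate(depth_stack):
        out_row = []
        for u, depths in enumerate(row):
            # Ray direction through the pixel center at unit depth.
            x_dir = (u + 0.5 - cx) / fx
            y_dir = (v + 0.5 - cy) / fy
            out_row.append(
                [(x_dir * z, y_dir * z, z) for z in depths if z is not None]
            )
        points.append(out_row)
    return points


def pixel_of(point: Point, fx: float, fy: float, cx: float, cy: float) -> Tuple[int, int]:
    """Project a 3D point back to the (u, v) pixel whose ray it lies on."""
    x, y, z = point
    return (math.floor(fx * x / z + cx), math.floor(fy * y / z + cy))


def is_front_to_back(ray_points: List[Point]) -> bool:
    """Check that the points on one ray are ordered by increasing depth."""
    zs = [p[2] for p in ray_points]
    return all(a <= b for a, b in zip(zs, zs[1:]))


# Toy 1x2 "image": the left pixel sees three surfaces, the right pixel one.
depth_stack = [[[1.0, 2.0, 4.0], [1.5, None, None]]]
intrinsics = dict(fx=1.0, fy=1.0, cx=1.0, cy=0.5)
points = unproject_depth_stack(depth_stack, **intrinsics)
```

In this toy setup every point in `points[0][0]` projects back to pixel `(0, 0)` via `pixel_of`, including the two occluded layers, which is the sense in which hidden geometry stays tied to the pixel it came from. A real model would predict the depth stack as a dense array rather than nested lists.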

@theworldlabs @HaoZhang623 @BDuisterhof @DrTunnels A pivotal leap: shaping 3D with 2D priors.