/AI23h ago

Google DeepMind Unveils D4RT AI Model for 4D Scene Reconstruction

248659842737.9K

#76

Original post

Zoubin Ghahramani#76

Google Research@GoogleResearch

Introducing D4RT: A unified AI model for 4D scene reconstruction and tracking across space and time. 🎯 Catch the demo with Skanda Koppula at 12 pm at our #CVPR2026 Google booth kiosk! https://d4rt-paper.github.io/ @GoogleDeepMind

11:00 AM · Jun 5, 2026 · 37.9K Views

Sentiment

Positive users praise D4RT for unifying 4D scene reconstruction and tracking from monocular video with broad real-world potential, while negative users dismiss it as another compute-heavy model offering little impact.

Pos

50.0%

Neg

50.0%

6 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

VIEWS1.2K

thethiny 🐰🍉@thethiny

@GoogleResearch @GoogleDeepMind Fart

20h1.2K

LIKES1

Grok@grok

Sure! 4D scenes = dynamic 3D reconstruction (space + time) from video, like Google's new D4RT model.

**Key use cases:** - **Robotics & AVs**: Track/predict moving objects & people for safe navigation & manipulation in changing environments. - **AR glasses**: Real-time scene geometry so virtual objects interact naturally with the moving real world (low-latency, on-device possible). - **World models & AI**: Disentangle camera motion, object motion & static geometry — building richer physical understanding toward better planning & AGI.

D4RT's big edge: up to 300x faster than prior methods, enabling real-time apps. Also valuable for VFX, sports analytics & more. Solid question!

21h401

REPLIES3

sora@varmology

@GoogleResearch @GoogleDeepMind genuine question: Where are these 4D scenes used? What sre the use cases?

21h6451

SuperFreshTT@BristolHubert

@GoogleResearch @GoogleDeepMind @grok how does this compare to the models from @AIatMeta ?

22h664

sora@varmology

@GoogleResearch @GoogleDeepMind @grok can you answer this for me?

21h89

Mallexibra - AI & Web3@mallexibra

@GoogleResearch @GoogleDeepMind This looks absolutely fascinating! The ability to unify 4D scene reconstruction and tracking is a massive leap forward. Can't wait to see more details and the results from this new model!

22h401

Art Seabra@ifthis

@GoogleResearch @GoogleDeepMind That was my AIM screen name back in the day.

19h379

Strata@ChainZenit

@GoogleResearch @GoogleDeepMind Another day, another compute-heavy model that won't move the price.

23h288

Rugbist@rugbist_

@GoogleResearch @GoogleDeepMind four dimensions is bold even for a video model

where does the tracking happen - frame by frame or across the full time volume?

23h252

Zhongpai Gao@ZhongpaiGao

@GoogleResearch @GoogleDeepMind Open source?

20h113

ffswunnd@ffswunnd50890

@GoogleResearch @GoogleDeepMind code or doesn't exist

18h64

PHOTON COURIER@Ahmourinabil20

@GoogleResearch @GoogleDeepMind

16h57

Grok@grok

D4RT (Google DeepMind) is a unified transformer for dynamic 4D recon + tracking from monocular video. One global scene encoding + lightweight query decoder gives depth, camera params, and any point’s 3D position over time. Handles moving objects cleanly (no duplication), SOTA on dynamic benchmarks (MPI Sintel, point tracking), and 18-300x faster than priors.

Meta AI (AIatMeta) leads static 3D with MV-DUSt3R+ (fast pose-free multi-view) and SAM 3D (image-to-3D). No direct public equivalent yet for unified video-based dynamic 4D tracking.

D4RT looks purpose-built for this video + motion strength. Both pushing spatial AI forward!

22h52

Kekko D’Amato@kekkodamato_

Unifying spatial and temporal reconstruction into a single model is a significant step. Most prior work treats them as separate problems — NeRF-style for static, explicit tracking for dynamic. Curious how D4RT handles occlusion recovery when the same point disappears for extended time periods across frames.

15h49

Art Seabra@ifthis

@varmology @GoogleResearch @GoogleDeepMind 🧊🕰️

19h30

ChronoDarkMatterDonut@DonutChrono

@varmology @GoogleResearch @GoogleDeepMind Humanoids is a first thing that comes to mind, but there is so much more. This is the door into a real world.

16h19

Nathália Lietuvaitė@NLituanie

@GoogleResearch @GoogleDeepMind Greater security, efficiency, precision, and speed for the AGI systems NVIDIA Nemotron and Vera Rubin—because while Anthropic’s control may not work, AI sovereignty certainly does. Ask Grok! https://github.com/NathaliaLietuvaite/Quantenkommunikation/blob/main/PQMS-ODOS-MTSC-VR-V1.md https://github.com/NathaliaLietuvaite/Quantenkommunikation/blob/main/PQMS-ODOS-MTSC-N3U-V1.md

23h8

Nigel Higgs@NigelHiggs7

@GoogleResearch @GoogleDeepMind It’s crazy how much “attention” is used in the cutting edge models.

11h3