2d ago

Gemini 3.1 Enables Real-Time Semantic Annotation of 3D Gaussian Splats

Original post

Semantically annotating 3D gaussian splats on the fly using gemini 3.1 + sparkjs

1. Load any 3D scene and hit scan
2. Get 2D detections from VLM
3. Cluster outputs & project into 3D world space
4. Save as a persistent 3D semantic layer

Inspired by @alexanderchen's experiments with gemini visual intelligence. Just had to try to lift it from 2D to 3D!

7:15 PM · May 14, 2026

2:15 AM · May 15, 2026 · 42.6K Views
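Step 3 of the post, projecting a 2D detection into 3D world space, can be sketched roughly as follows. This is a minimal illustration, not the author's code. It assumes a pinhole camera in the three.js convention (looking down -Z, +Y up), a row-major 4x4 camera-to-world matrix, and a depth value for the pixel that the renderer can supply. The `[ymin, xmin, ymax, xmax]` box normalized to 0-1000 is an assumed detection format, and the names `box_center_px` and `unproject` are hypothetical.

```python
import math


def box_center_px(box_2d, width, height):
    """Center of a detection box in pixels.

    Assumes box_2d = [ymin, xmin, ymax, xmax], normalized to 0-1000
    (an assumed VLM output format, not confirmed by the post).
    """
    ymin, xmin, ymax, xmax = box_2d
    return ((xmin + xmax) / 2000.0 * width, (ymin + ymax) / 2000.0 * height)


def unproject(px, py, depth, width, height, fov_y_deg, cam_to_world):
    """Lift pixel (px, py) at a known view-space depth into world space.

    Assumes a pinhole camera looking down -Z with +Y up and a
    row-major 4x4 camera-to-world matrix (list of four rows).
    """
    # Pixel -> normalized device coordinates in [-1, 1]; flip Y so up is positive.
    ndc_x = (px / width) * 2.0 - 1.0
    ndc_y = 1.0 - (py / height) * 2.0

    tan_half = math.tan(math.radians(fov_y_deg) / 2.0)
    aspect = width / height

    # Point in camera space, `depth` units in front of the camera.
    x = ndc_x * tan_half * aspect * depth
    y = ndc_y * tan_half * depth
    z = -depth

    # Apply the camera-to-world transform to the homogeneous point (x, y, z, 1).
    m = cam_to_world
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3] for r in range(3))
```

In practice the depth would come from the splat renderer (for example a depth pass or a raycast at the box center); how that is obtained is outside this sketch.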

For those asking how to do this, here's the recipe. TL;DR: wiggle the camera around while taking screenshots and asking Gemini for screen-space annotations, then cluster them so they are pinned to the right spot in 3D space. You can make the projection into world space even more precise by using SAM3 labels to reason about containment, while still using the VLM for the richer label / description.
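The clustering part of the recipe might look something like the sketch below. It assumes each screenshot has already produced `(label, world_point)` pairs, and it uses a simple greedy, radius-based grouping per label; the post does not say which clustering method was actually used. The `radius` and `min_votes` parameters and the JSON layout of the saved layer are hypothetical.

```python
import json
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    label: str
    points: list = field(default_factory=list)

    @property
    def centroid(self):
        n = len(self.points)
        return tuple(sum(p[i] for p in self.points) / n for i in range(3))


def cluster_detections(detections, radius=0.25, min_votes=2):
    """Greedily merge same-label 3D points that fall within `radius`.

    detections: iterable of (label, (x, y, z)) gathered over many frames.
    Clusters seen in fewer than `min_votes` frames are dropped as noise.
    """
    clusters = []
    for label, point in detections:
        best, best_dist = None, radius
        for c in clusters:
            if c.label != label:
                continue
            d = math.dist(c.centroid, point)
            if d <= best_dist:
                best, best_dist = c, d
        if best is None:
            best = Cluster(label)
            clusters.append(best)
        best.points.append(point)
    return [c for c in clusters if len(c.points) >= min_votes]


def to_semantic_layer(clusters):
    """Serialize clusters to JSON so the annotations persist (hypothetical schema)."""
    return json.dumps(
        [
            {"label": c.label, "position": list(c.centroid), "votes": len(c.points)}
            for c in clusters
        ]
    )
```

Requiring several votes per cluster is one way to make the "wiggle the camera" step pay off: a label only sticks if multiple viewpoints agree on roughly the same 3D location.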

Bilawal Sidhu (@bilawalsidhu)

7:04 PM · May 15, 2026 · 1.4K Views
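The containment idea from the recipe can be illustrated with a small sketch. It assumes segmentation masks (from SAM3 or any similar model) are available as plain 0/1 grids, and that a VLM box is given in pixel coordinates as `(xmin, ymin, xmax, ymax)` with exclusive maxima. These representations, the function names, and the `min_fraction` threshold are all hypothetical; no real SAM3 API is used here. The idea is to pick the largest mask that sits mostly inside the VLM box, then anchor the label at the mask centroid rather than the box center.

```python
def mask_fraction_in_box(mask, box):
    """Fraction of a mask's pixels that fall inside the box."""
    xmin, ymin, xmax, ymax = box
    total = inside = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                total += 1
                if xmin <= x < xmax and ymin <= y < ymax:
                    inside += 1
    return inside / total if total else 0.0


def best_mask_for_box(masks, box, min_fraction=0.8):
    """Index of the largest mask that is mostly contained in the box, or None."""
    best_i, best_area = None, 0
    for i, mask in enumerate(masks):
        area = sum(map(sum, mask))
        if mask_fraction_in_box(mask, box) >= min_fraction and area > best_area:
            best_i, best_area = i, area
    return best_i


def mask_centroid(mask):
    """Mean (x, y) of the mask's set pixels; a tighter anchor than the box center."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("empty mask")
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)
```

The VLM still supplies the rich label and description; the mask only refines which pixel gets lifted into world space.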