/AI1d ago

KE:SAI Releases 123D, Largest Unified Open Dataset for Autonomous Driving

11226801

#1577

Original post

Andrei Bursuc @CVPR#1577

Bernhard Jaeger@bern_jaeger

🔬 This week's research highlight is 123D, KE:SAI's effort to unify all open driving data, creating the largest and most diverse pool of autonomous driving data out there.

5:02 AM · Jun 9, 2026 · 794 Views

Sentiment

Users are excited about the 123D platform unifying open driving datasets for KE:SAI because it enables training large-scale open foundation models without relying on proprietary sources.

Pos

100.0%

Neg

0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 91 · Bookmarks: 1 · Likes: 1

Bernhard Jaeger@bern_jaeger

📜 "123D: Unifying Multi-Modal Autonomous Driving Data at Scale"

https://arxiv.org/abs/2605.08084

Replies: 1

Bernhard Jaeger@bern_jaeger

I am particularly excited about 123D for KE:SAI because it enables us to train large-scale open foundation models without having to rely on proprietary data, making our work easier to reproduce and to share.

Bernhard Jaeger@bern_jaeger

Many big advancements in AI in recent years were preceded by a consolidation effort around data that enabled them.

Bernhard Jaeger@bern_jaeger

You can find the open-source code on GitHub: https://github.com/kesai-labs/py123d

Bernhard Jaeger@bern_jaeger

🌍 To solve this, KE:SAI has developed 123D, an open-source framework that unifies multimodal driving data through a single API.

Today, 123D already unifies 3300 hours of data spanning 90000 km of real-world driving from nuScenes, Waymo, Argoverse, and many others.

Bernhard Jaeger@bern_jaeger

To name a few, Common Crawl enabled the training of large language models, LAION enabled the training of diffusion models for image generation, and Open X-Embodiment enabled robotics foundation models.

Bernhard Jaeger@bern_jaeger

Now with 123D, you can release your data in a unified format that is compatible with all the existing datasets and directly benefit from new research breakthroughs.

We hope 123D will encourage more companies to contribute data to the open data ecosystem in the coming years.

Bernhard Jaeger@bern_jaeger

Each dataset adopts different modalities: different cameras, lidars, ego states, annotations, and HD maps, each with different rates and synchronization schemes.
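
To make the rate and synchronization problem concrete, here is a minimal sketch of nearest-timestamp matching between two sensor streams. It is not taken from py123d: the function names, the example timestamps, and the `max_gap_us` tolerance are all hypothetical, and real datasets may need interpolation or hardware-trigger metadata instead.

```python
from bisect import bisect_left


def nearest_index(sorted_timestamps, query):
    """Return the index of the timestamp closest to `query`.

    Assumes `sorted_timestamps` is non-empty and sorted ascending.
    """
    pos = bisect_left(sorted_timestamps, query)
    if pos == 0:
        return 0
    if pos == len(sorted_timestamps):
        return len(sorted_timestamps) - 1
    before = sorted_timestamps[pos - 1]
    after = sorted_timestamps[pos]
    # Ties go to the earlier sample.
    return pos if after - query < query - before else pos - 1


def align(reference, other, max_gap_us):
    """Pair each reference timestamp with the nearest `other` timestamp.

    Returns (reference_index, other_index) pairs; other_index is None when
    the closest sample is further away than `max_gap_us` (hypothetical
    tolerance) or when `other` is empty.
    """
    pairs = []
    for i, t in enumerate(reference):
        if not other:
            pairs.append((i, None))
            continue
        j = nearest_index(other, t)
        pairs.append((i, j if abs(other[j] - t) <= max_gap_us else None))
    return pairs


# Made-up timestamps in microseconds: a 10 Hz lidar and a ~12 Hz camera.
lidar_us = [0, 100_000, 200_000]
camera_us = [5_000, 88_000, 171_000, 254_000]
pairs = align(lidar_us, camera_us, max_gap_us=20_000)
print(pairs)
```

In this toy data the third lidar sweep has no camera frame within 20 ms, so it is paired with `None` rather than silently matched to a stale image. Every dataset makes a choice like this differently, which is part of what makes joint use hard.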

Bernhard Jaeger@bern_jaeger

It makes it easy to study areas such as viewpoint-robust 3D object detection or to test the generalization capabilities of reinforcement learning agents.

Bernhard Jaeger@bern_jaeger

The 123D paper provides some baselines for these tasks, but there is still a lot of room for new methods to improve performance.

We hope the community leverages the 123D data to solve some of the important open generalization problems in autonomous driving.

Bernhard Jaeger@bern_jaeger

🚗 Autonomous driving has yet to see this type of consolidation.

Despite there being many different datasets available online, it is very hard to use them jointly.

Bernhard Jaeger@bern_jaeger

🏭 If you are a company that wants to spend the effort and time to open-source data, this used to present a significant risk.

Since your data will be incompatible with all the existing research dataset formats, your dataset might simply not get adopted by the research community.

Bernhard Jaeger@bern_jaeger

If you try making your code compatible with all the different coordinate system conventions out there, you will quickly throw your PC out of the window.
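
As a small illustration of the coordinate-convention problem, the sketch below converts a point from a camera-style frame (x right, y down, z forward) to a vehicle-style frame (x forward, y left, z up). These two conventions are assumed for the example only; the helper name is hypothetical and this is not the py123d API.

```python
# Rows express the vehicle-frame axes in camera-frame coordinates:
#   vehicle x (forward) =  camera z
#   vehicle y (left)    = -camera x
#   vehicle z (up)      = -camera y
# Assumed conventions for illustration; check each dataset's documentation.
CAMERA_TO_VEHICLE = (
    (0.0, 0.0, 1.0),
    (-1.0, 0.0, 0.0),
    (0.0, -1.0, 0.0),
)


def camera_to_vehicle(point):
    """Rotate a 3D point from the camera-style to the vehicle-style frame.

    Hypothetical helper: it only handles the axis permutation, not the
    sensor's mounting translation or calibration.
    """
    return tuple(
        sum(r * p for r, p in zip(row, point)) for row in CAMERA_TO_VEHICLE
    )


# A point 2 m to the right, 1 m below, and 10 m in front of the camera.
point_camera = (2.0, 1.0, 10.0)
point_vehicle = camera_to_vehicle(point_camera)
print(point_vehicle)  # (10.0, -2.0, -1.0)
```

A single axis permutation like this is simple on its own. The difficulty described above comes from keeping track of many such conventions, plus differing origins and rotation representations, across every dataset at once.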

Bernhard Jaeger@bern_jaeger

123D is a collaboration between many institutions and people, in particular:

@DanielDauner, Valentin Charraut, @BastianBerle, Tianyu Li, Long Nguyen, Jiabao Wang, Changhui Jing, @MaxiIgl, Holger Caesar, @iamborisi, @yiyi_liao_, Andreas Geiger, and Kashyap Chitta.