6h ago

Jasper AI releases MONET, an Apache 2.0 dataset of 105 million image-text pairs, alongside the Nano T2I training codebase

The open dataset is hosted on Hugging Face.

Original post

📢 New @heyjasper release! 📢
MONET 🌸: An Apache 2.0 deduped and recaptioned dataset of 105M samples, unlocking reproducible text-to-image research.
Nano T2I 🖌️: A codebase to train your own T2I model.
🤗 @huggingface: https://huggingface.co/datasets/jasperai/monet
💻: https://github.com/gojasper/nano-t2i
Very excited about this new release, pushing the boundaries of open and reproducible T2I research. Congrats to the team! Benjamin Aubin, Gonzalo Quintana, @onurxtasar, @UlaLaParis, @_jeev2, @dh7net, @clipdropapp, @heyjasperai

6:01 AM · May 28, 2026 View on X
Reposted by

With 104M image-text pairs, this is one of the largest, if not the largest, openly licensed image datasets.

And it's on @huggingface!!

Kudos @heyjasperai

Quoting Clément Chadebec (@CChadebec): the MONET and Nano T2I release announcement reproduced in full under "Original post" above.

1:01 PM · May 28, 2026 · 19.7K Views
1:47 PM · May 28, 2026 · 10.9K Views