Keshigeyan Chandrasegaran and Kyle Sargent launch GPIC, a permissive image-text dataset and benchmark for visual generation
The corpus of roughly 28 trillion pixels is fully permissive for both research and commercial use.
QUOTE POST
Fei-Fei Li (@DRFEIFEI)
I’m very excited by this new benchmark dataset for visual generation that is suitable for the modern era of large scale generative models!🤩
1/ Introducing GPIC: a Giant Permissive Image Corpus and benchmark for visual generation!
🚀 100M VLM-captioned image-text pairs for training
📊 1M image-text pairs for benchmarking
🖼️ ~28 trillion pixels
🤗 Centrally Hosted
✅ Fully permissive for research + commercial use
Dataset, benchmark and models 🧵👇
Co-led with @KyleSargentAI
4:30 PM · May 29, 2026 · 9.2K Views
4:56 PM · May 29, 2026 · 6.6K Views