AI · 18h ago

More than 25 open-weight models launch in a single week, led by NVIDIA's Nemotron 3 Ultra and Google's Gemma 4

Ideogram also shipped its first open-weight image model.

Original post
Victor M @victormustar

Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:

🧠 LLMs

→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.

→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.

→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.

→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.

→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.

🎨 Image gen (the surprise of the week)

→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can't believe this is open weights.

🔊 Audio & Speech (a breakout week for open TTS, 4 labs shipped)

→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.

→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.

→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.

→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.

👁️ Vision & VLMs

→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.

→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.

🎬 Video, 3D & World Models

→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.

→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.

→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).

2:59 PM · Jun 5, 2026 · 196.8K Views
Sentiment

Users are excited about the record week of open-weight AI model releases across modalities because of the impressive volume, quality, pace, and new consumer-accessible capabilities like large context windows.

Positive 97.0% · Negative 3.0% · 18 comments with sentiment
Cluster Engagement
Posts from X
Gavin Baker @GavinSBaker

Quite a week for open-source AI. Especially American open-source. Nemotron 3 Ultra is the most important release in quite some time. And some really cool RL and fine-tuning work from Harvey.

17m · 2.9K Views · 27 Likes · 13 Bookmarks

Victor M @victormustar

Let’s not forget:

https://huggingface.co/Hcompany/Holo-3.1-4B

And from yesterday

https://huggingface.co/collections/google/gemma-4-qat-q4-0

8h · 836 Views · 6 Likes · 7 Bookmarks

Levi Fawcett @Levi_Fawcett

@victormustar We can't manually evaluate every model at this point. Is there someone who evals/benchmarks each model, with examples? Would follow.

5h · 389 Views · 1 Bookmark

Milan @influenist

@victormustar The model race is becoming a commodity race.

The next winners won't be determined by model names, but by products, agents and distribution.

8h · 312 Views · 1 Like · 1 Bookmark

@victormustar It's easy to underestimate the importance of the smaller task specific models

8h · 222 Views · 1 Like

Victor M @victormustar

@Levi_Fawcett your agents will 🚀 against your real project

5h · 238 Views · 2 Likes

Maziyar PANAHI @MaziyarPanahi

@victormustar damn! 🔥

5h · 248 Views · 1 Like

Jeroen Dee @jeroendee

@victormustar @huggingface Cc @jgordijn LLM releases quantity

10h · 141 Views · 1 Like

CJ @cjtrade01

@victormustar People sleeping on Qwen 3.7. Best OS LLM.

5h · 123 Views · 1 Like

@victormustar Who expected this many drops within a single week?

3h · 56 Views · 1 Like

Virgil Maro @_virgil19

@victormustar 25 frontier drops in one week means the weights are the commodity now. whatever's scarce moved up a layer, to the harness and evals that make one actually usable

15h · 53 Views · 1 Like

Kekko D’Amato @kekkodamato_

@victormustar The gap between open and closed is collapsing faster than most timelines assumed. 550B hybrid MoE with 1M context dropping open-weight changes the calculus around proprietary model moats significantly. What used to be a 12-month lag is now weeks.

12h · 167 Views

@victormustar Didn't Chandra 2 drop this week as well for ocr?

12h · 163 Views

Sayooj @sayoojkeloth

@victormustar Insane week

12h · 150 Views

Mary Newhauser @m_newhaus

@victormustar @NielsRogge Literally the worst week to be on vacation. Maximum fomo 🥲

7h · 143 Views

Alex @Myrlikte

@victormustar idiogram lol

13h · 140 Views

CypherDom @brunobevale

@victormustar Thank you so much for bringing not only LLM but vision, TTS and other launches. This is important

2h · 133 Views

@victormustar 55b active still hurts serving when expert routing jitter blows up tail latency

8h · 130 Views