More than 25 open-weight models launch in a single week, led by NVIDIA's Nemotron 3 Ultra and Google's Gemma 4

Ideogram also shipped its first open-weight image model.

Original post

Victor M@victormustar

Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:

🧠 LLMs

→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.

→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.

→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.

→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.

→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.

🎨 Image gen (the surprise of the week)

→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can't believe this is open weights.

🔊 Audio & Speech (a breakout week for open TTS, 4 labs shipped)

→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.

→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.

→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.

→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.

👁️ Vision & VLMs

→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.

→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.

🎬 Video, 3D & World Models

→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.

→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.

→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).

2:59 PM · Jun 5, 2026 · 196.8K Views
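
For the MoE entries in the LLM list above, the share of parameters active per token follows directly from the quoted figures. The sketch below is illustrative only: the names `MOE_MODELS`, `active_share`, and `summarize` are hypothetical, the totals for LFM2.5-8B-A1B and Mellum2-12B-A2.5B-Thinking are assumed from the model names (8B and 12B), and the "~11B" active figure for Step-3.7-Flash is approximate in the post.

```python
# Total and active parameter counts in billions, as quoted in the post.
# Assumptions: the 8B and 12B totals are read off the model names,
# and 11B for Step-3.7-Flash is the post's approximate "~11B".
MOE_MODELS = {
    "NVIDIA Nemotron 3 Ultra": (550.0, 55.0),
    "StepFun Step-3.7-Flash": (198.0, 11.0),
    "Liquid AI LFM2.5-8B-A1B": (8.0, 1.5),
    "JetBrains Mellum2-12B-A2.5B-Thinking": (12.0, 2.5),
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that are active per token."""
    if total_b <= 0 or active_b < 0 or active_b > total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


def summarize(models: dict) -> list:
    """Return (name, total, active, share) rows, sparsest model first."""
    rows = [
        (name, total, active, active_share(total, active))
        for name, (total, active) in models.items()
    ]
    return sorted(rows, key=lambda row: row[3])


if __name__ == "__main__":
    for name, total, active, share in summarize(MOE_MODELS):
        print(f"{name}: {active:g}B of {total:g}B active ({share:.1%})")
```

Sorting by share puts the sparsest model first. This says nothing about quality or speed on its own; it only makes the "only 55B active" style claims comparable across the list.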

Sentiment

Users are excited about the record week of open-weight AI model releases across modalities, citing the volume, quality, and pace of the drops and new consumer-accessible capabilities such as large context windows.

Positive: 97.0%

Negative: 3.0%

18 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 2.9K · Bookmarks: 13 · Likes: 27 · Replies: 3

Gavin Baker@GavinSBaker

Quite a week for open-source AI. Especially American open-source. Nemotron 3 Ultra is the most important release in quite some time. And some really cool RL and fine-tuning work from Harvey.

Victor M@victormustar

Let’s not forget:

https://huggingface.co/Hcompany/Holo-3.1-4B

And from yesterday

https://huggingface.co/collections/google/gemma-4-qat-q4-0

Levi Fawcett@Levi_Fawcett

@victormustar We can't manually evaluate every model at this point. Is there someone who evals/benchmarks each model, with examples? Would follow.

Milan@influenist

@victormustar The model race is becoming a commodity race.

The next winners won't be determined by model names, but by products, agents and distribution.

Richard Palethorpe@jichiep

@victormustar It's easy to underestimate the importance of the smaller task specific models

Victor M@victormustar

@Levi_Fawcett your agents will 🚀 against your real project

Maziyar PANAHI@MaziyarPanahi

@victormustar damn! 🔥

Jeroen Dee@jeroendee

@victormustar @huggingface Cc @jgordijn LLM releases quantity

CJ@cjtrade01

@victormustar People sleeping on Qwen 3.7. Best OS LLM.

α alias Dwarf Star ✨@deeperflows

@victormustar Who expected this many drops within a single week?

Virgil Maro@_virgil19

@victormustar 25 frontier drops in one week means the weights are the commodity now. whatever's scarce moved up a layer, to the harness and evals that make one actually usable

Kekko D’Amato@kekkodamato_

@victormustar The gap between open and closed is collapsing faster than most timelines assumed. 550B hybrid MoE with 1M context dropping open-weight changes the calculus around proprietary model moats significantly. What used to be a 12-month lag is now weeks.
