Today we're releasing Mellum2: our first "serious" LLM.
This is a 12B A2.5B MoE LLM (12B total parameters, 2.5B active per token), pre-trained on ~11T tokens and post-trained with RLVR. I'm proud to lead the team that has been working on it for the past 6 months.
We're releasing the base, SFT, and RL checkpoints along with a tech report.