Do I understand it correctly that the OLMo from-scratch series is coming to an end?
If so, it looks like NVIDIA stepped up just in time with the Nemotron models, as the only remaining fully open (i.e. not just a weight drop) from-scratch LLM team.
AI Judge changed title after evaluation, original title: "Meta's Lucas Beyer questions if open-source LLM pretraining is ending, prompting details of Marin's 6x speedup"
Stanford's Marin project remains another fully open training framework
Users praise NVIDIA Nemotron and Stanford Marin for fully open-sourcing LLM training while others worry about OLMo vanishing along with its shared training data.
One of my favorite projects is Marin from the Stanford folks: they take a scientific approach to training, are ready to take risks, and are fully open (even open development, where you can follow everything on GitHub!)
https://github.com/marin-community/marin
@giffmana Great fully open pretraining efforts going on at Marin!
There are two types of advances: (i) a singular change that provides 3x and (ii) a series of micro changes that each provide 20%. It is easy to celebrate (i), but (ii) is just as important, and the hard part is making sure the improvements stack. We care about both in Marin.
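As a rough illustration of the arithmetic behind that comparison, here is a minimal sketch. It assumes the micro changes stack perfectly, meaning their gains multiply with no interference, which is exactly the part the tweet calls hard. The function names are hypothetical and are not taken from Marin.

```python
import math


def stacked_speedup(gains):
    """Multiply relative gains together, e.g. 0.20 for a 20% improvement.

    Assumes the gains are independent and compose multiplicatively.
    """
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


def changes_needed(per_change_gain, target):
    """Smallest number of equal-sized gains whose product reaches `target`."""
    return math.ceil(math.log(target) / math.log(1.0 + per_change_gain))


# Six perfectly stacking 20% improvements land just short of 3x.
print(round(stacked_speedup([0.20] * 6), 2))  # 2.99

# Reaching a full 3x takes a seventh 20% improvement.
print(changes_needed(0.20, 3.0))  # 7
```

Under this idealized assumption, about six or seven well-stacked 20% changes are worth one singular 3x change. In practice, improvements often overlap, so the real product can be smaller.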
@eliebakouch Yeah they are interesting, but my understanding is that they haven't *finished* (as in also mid and post) anything yet?

@giffmana i'd say don't count olmo out just yet :)

Two things make me think that:
1) I remember some comms not too long ago about the new strategy, which sounded much more application/post-training focused to me (sorry, I don't remember where, but it was from Ai2)
2) A large portion of the team is gone
That being said, I'm an OLMo fan for the openness and reasonable quality, so I would be happy if my impression is proven wrong!
@eliebakouch they just need a Twitter account, it’s so hard to find or link to their work 🥲
@giffmana yes true, quite excited for them to tackle post training the same way they do for pre training!

@giffmana I hope not! I really appreciate NVIDIA’s recent efforts across several areas of open-source AI, but I would hate to see OLMo disappear. It has been such a valuable open-source project, built by an incredibly talented group of people.

@giffmana Arcee?

@giffmana Apertus?🇨🇭 I heard there will be a v2.

@giffmana stay tuned. lots of exciting work underway is all i can say rn, and I think the community will be pleased with the outputs

@giffmana Haven't looked into it too deeply but isn't stepfun also fully open?

@giffmana Did you see project Marin?

@giffmana Apertus?

@giffmana K2 from @IFM_MBZUAI

@giffmana They should get Nathan on board

@bygregorr @giffmana Pre- and post-training data, software, recipes, models (base, post, reward), full technical report.
Hard to be more open than Nemotron.

@giffmana not sure nemotron clears that bar the same way olmo did. olmo dropped the full pretraining dataset too, does nemotron actually publish the training data or just weights + recipes?

@giffmana 💚