@eliebakouch @giffmana We use both megatron-lm and megatron bridge. Pretraining and SFT for Nemotron 3 were all done in megatron-lm.
But we did move as much functionality as possible into Megatron-core, in order to make it easier to integrate Megatron features into other codebases.
@giffmana torchtitan and olmo-core are great!
also worth noting: i think the nvidia team doesn't use megatron-lm to train models anymore; they use megatron bridge (which is based on megatron-core, a submodule of megatron-lm)