@scaling01 5.5 has really good post training. OpenAI sucks at pre training. Scars from GPT 4.5 haven't healed.
@zephyr_z9 so GPT-5.5 sucks for its size and training a 3-5x larger model is impossible for them?
Lisan al Gaib countered that GPT-5.5 is an experimental GB200 iteration.
@zephyr_z9 I knew you would mention GPT-4.5, because why else would you make that statement?
if GPT-4.5 is your only data point then you are overindexing
I see GPT-5.3, GPT-5.4 and GPT-5.5 all as experiments for training on GB200s
the next step, another scaled up model, is natural
@scaling01 Yes, they are training a big model for gpt-6 but it ain't 15T-20T
@zephyr_z9 @scaling01 noob question but how can one suck at pretraining
what makes it so hard?
in my mind if you have a great dataset and press the button it just works?
@zephyr_z9 @scaling01 They will be fine. With the amount of compute they are gonna have, they can train many few-trillion-parameter models at once and then choose the best one.