@sama this would be an absolute game changer if you can actually pull this off. i dunno how it’ll scale infra-wise, sounds hard.
but would instantly change behaviors for speed.
oh and also...750 token/sec coming to 5.6 sol in july!
The performance update will run on 5.6 sol hardware
Supportive users are excited about OpenAI's teased 750 tokens/sec frontier-model inference speed because of its major practical gains, while critical users resent the teasing of a capability they cannot yet access.
experiencing a frontier model running at 750 tokens/sec gives you the same sense of wonder as seeing AI for the first time. excited to ship this soon!
does anybody else have the problem that it scrolls way too fast
I've made an effort to code up a smooth-scrolling, high-fps terminal, but it's quite a hairball
When something moves so fast don't bother trying to read it as it's being written
Read it at your pace

@DanielleFong

@stevenheidel I wouldn’t know what that’s like. 🙄 what’s up with OAI employees dangling these carrots in front of their large user base that has no access? It’s frustrating!

@stevenheidel nice. insane this is working right now... hope you're getting loads done with it autoresearching stuff lol. so much i would make with that

@stevenheidel if people can't use it, this conversation is useless

@stevenheidel Very excited for it!

@stevenheidel A taste of the future!

@dioscuri blonde blonde, brunette. rm -rf

@stevenheidel Excited to ship speed to a walled garden. 750 tok/s means nothing when the API is politically gated. It's hard to feel excitement about 'shipping the future' when the API is explicitly designed to restrict it. You're just measuring how fast the digital divide is accelerating?

@signulll @sama 750MW provisioned specially for OpenAI according to Cerebras, I do not know if it is enough? I have no idea

@stevenheidel How much does it cost to use $CBRS

@stevenheidel very exciting to the trusted partners indeed

@stevenheidel /goal will be insane on 5.6 sol ultra lol

@stevenheidel @dkundel This will be another level when it’s cranking that consistently while cooking in Codex. Looking forward to that!

@stevenheidel Only for the elite

@stevenheidel How much more expensive though? 🫣

@stevenheidel Good I’m tired of latency I’m wasting all my time watching LLMs think

@stevenheidel This is the time when CPUs are now pushed to their maximum capability. It wasn't the case before with normal coding.

@stevenheidel spark was amazing but 5.3 a bit old will be incredible w sota