@woke8yearold pretty good midwit test for when someone was unimpressed with fable
Runway founder Siqi Chen and engineer Alex Graveley argue that unimpressed reactions to Fable serve as a 'midwit test'.
The online exchange offers no technical details about Fable.
Supportive users praise Fable for handling complex tasks without errors, as if it were the first real model, while critical users call it verbose and overpriced and dismiss the surrounding hype.

@tszzl @woke8yearold Relatedly (pretty cold take by now)

@tszzl @woke8yearold Yeah I've been overly polite about it but anyone who was not impressed with Fable is clearly not even attempting to rigorously evaluate what they can and can't do

@tszzl @blader @woke8yearold if you drive at 5mph, not much diff between a volvo and a ferrari

@tszzl @woke8yearold my bond with 5.5 has gotten so tight I didn't feel the fable hype ngl

@ayedtay @tszzl It actually stunned me to see how many people thought it was a slightly better Opus

@tszzl @woke8yearold gulp

@tszzl @woke8yearold I am still irritated at Claude.

@tszzl @woke8yearold at least one major exception

unusually smart like... I was building a trading bot. I'd already optimized it pretty well with the state-of-the-art agents, pro research into the latest series of Codexes and Claude's xhigh agents. It's actually something I've been working on myself for a good five years too.
It managed to do a sequence of five tasks, each of them getting better results on the k-fold validation.
It's quite shocking. Like, the actual research tasks that it's choosing to do are smart ... I used to think there was kind of a ceiling, like surely the signal-to-noise ratio is just huge so you're always going to be a few steps back.
Like if you think about searching the symbol universe of things that could be traded, it's not particularly differentiable or obviously differentiable, so most of the time you're just trying symbols and it's coin flipping.
Fable did not go after the coin flipping path.
Each time required some kind of deep novel insight into how gradient boosted trees work or how time series transformers work, how risk works.
Now it's gone and I'm back to coin flipping my way through lol.
Missing fable so much

@tszzl @woke8yearold fable was a high taste model, but not uniquely capable. just good at one-shotting things without having its hand held. going back to 5.5 didn't feel crippling for me, more like dropping one level of abstraction

are you still going to be impressed with it in 3 years? 5 years? 10 years? the models today are basically crap. less than garbage... they literally cannot do most economic roles a human can for the price point, esp in terms of efficiency. Esp long horizon or real time positions or work. Great for creating code and cool projects, often in one shot, and certain synthesis, but let's not get ahead of ourselves.
if you claim otherwise, may i remind you that you risk freaking out people prematurely, esp those who don't know any better *glances at admin/ regulators*
you should think of today's models as nothing better than clever multimodal coding chatbots++.
No native video modality esp input, expensive as hell, slow as hell (tokens per second is still in the tens to dozens range.) nah bro, it's great for the day and relative to recent models sure; but let's calm down a bit and be a bit realistic and grounded in our expectations. We were promised agi... Bar should be a bit higher imo ಠ_ಠ'

Roon pretty please give us an upgrade in Codex that beats Fable I beg you. We are all waiting and I think it’s a defining moment. If you wait too long to release a par or better one, Anthropic will de facto be seen as the lead for as long as you delay. I just quit my job to work on AI.

@tszzl @woke8yearold What were you impressed by? I only got a chance to chat with it and it was strangely verbose and unable to stop.
My vibe about it was that it was highly capable for complex long-range tasks but that it was somehow less predicated to avoid problems, hence bad at basic chatting

@tszzl @woke8yearold Idk I ran 5 ambitious research/planning/improvement prompts with both 5.5 xhigh and fable and they both had the same strategy/direction/plan; basically no gains from fable.
Maybe I just need to find harder things to build 😭

@OG_Jaybird @tszzl @woke8yearold This is totally a Grok question

@tszzl @woke8yearold Tell sama for me that we need to increase the model parameters to at least 10 trillion. The current model parameters are way too low. They can do the work and are very obedient, but their comprehension and insight are terrible

@tszzl @woke8yearold I've stopped being impressed because a better one will be along in a few months. I have limited "impressed" supplies and need to save them up.

@tszzl @woke8yearold my one experience with it was I asked it to do a code review (of a reasonably small PR), and before it could finish it ran out of fable tokens and switched back to opus 4.8
🤷

@noobpsyborg42 @tszzl @woke8yearold Nobody else is going to release a Fable-or-better model until they see where the USG/Anthropic squabble ends up, even if they had a model that good otherwise ready to go.