I think we’ve reached the point where normal people can’t really determine whether new models are better than previous ones. Like Fable doesn’t seem that much better to me, but every 150 IQ person I know is like “wow the singularity came sooner than I thought”.
Users Struggle to Judge New AI Models as Experts Spot Singularity Signs
Many users dismissed claims about new AI models and singularity signs as meaningless hype or self-soothing, while some praised the models for being faster and more capable.
@citrini "To appreciate talent you have to be within a standard deviation of that talent"

@stevehou I’m talking about engineers and people who are working on serious problems vs just finance bros trying to securitize new shit

@citrini @VentureCoinist I have a harness that makes a plan and tries to improve it iteratively until the plan cannot easily be improved
Usually you hit a top score where iterations become noise. Fable can break the previous ceiling pretty aggressively
Have to get used to being a meat puppet
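
The loop described in this reply can be sketched roughly as follows. This is a minimal illustration, not the poster's actual harness: `refine_until_plateau`, `propose`, `score`, `min_gain`, and `patience` are all hypothetical names, and "iterations become noise" is assumed to mean that several proposals in a row fail to beat the best score by a meaningful margin.

```python
from typing import Callable, List, Tuple


def refine_until_plateau(
    plan: str,
    propose: Callable[[str], str],   # hypothetical: asks a model for a revised plan
    score: Callable[[str], float],   # hypothetical: grades a plan, higher is better
    min_gain: float = 0.01,          # assumed threshold below which a gain counts as noise
    patience: int = 3,               # assumed number of stale rounds before stopping
    max_iters: int = 50,
) -> Tuple[str, float, List[float]]:
    """Keep revising the best plan until improvements stop being meaningful."""
    best_plan, best_score = plan, score(plan)
    history = [best_score]
    stale = 0
    for _ in range(max_iters):
        candidate = propose(best_plan)
        candidate_score = score(candidate)
        if candidate_score > best_score + min_gain:
            # A real improvement: adopt it and reset the stale counter.
            best_plan, best_score = candidate, candidate_score
            stale = 0
        else:
            # No meaningful gain: treat this round as noise.
            stale += 1
        history.append(best_score)
        if stale >= patience:
            break
    return best_plan, best_score, history


# Toy stand-ins: each revision appends a character, and the score caps at 10,
# which plays the role of the "ceiling" mentioned in the reply.
best_plan, best_score, history = refine_until_plateau(
    "",
    propose=lambda p: p + "x",
    score=lambda p: float(min(len(p), 10)),
)
print(best_score, len(history))
```

Under these assumptions, a stronger model shows up as a `propose` function whose revisions keep clearing `min_gain` after an older one has plateaued, so the final `best_score` lands above the old ceiling.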

@citrini How many "150 IQ person"s do you know?

It’s contextual.
Fable when asking a normal question via web or mobile? Very minor improvement, not noticeable.
Fable in an IDE code environment, poking through your code? It takes initiative when it finds issues, fixes them, and validates with tests, without being asked, to make sure there are no errors.
Noticeable difference in its vertical knowledge and attention to detail

@citrini LLMs need to integrate timestamps on user messages to better establish context and intuit linear time. I have stopped using LLMs for the most part because having to constantly clarify “today vs yesterday” is maddening.

@citrini this is a bear case for the labs themselves. real-world tasks getting saturated by the frontier, and china will catch up in a few months. hard for customers to justify buying SOTA and hard for the labs to justify selling anything but SOTA.

@citrini It's better at more substantial or open ended tasks, and it's not actually much better at smaller tasks. It's only when you run it in a loop with large context that you notice the difference.

@citrini 150 IQ or middle aged token addict?

@citrini Am I one of them?

@citrini Lol

@citrini I told Fable to fix a bug by looking at the source code, and instead of just reading the code, which Opus 4.7 or 4.8 would do, it wrote instrumentation code and ran the app.

@citrini @edison0xyz Algorithm doing its thing

@citrini @stevehou 👀

@DarkPoolTA @citrini they don't do this because it means you can't cache tokens

@citrini It's crippled, probably to below the level of Opus 4.8 at its highest reasoning setting.
Would like to try the b2b Mythos model however.
There was no reason to release Fable, and for just 14 days??

@goodalexander @citrini @VentureCoinist hey claude jerk it a little

@citrini If you believe in the singularity you are not 150 IQ.

@citrini My strategy plan.
⤵️