Yacine experienced poor results during language modeling research.
Many users criticized Fable for poor real-world performance, such as crashing and refusing tasks, and for falling short of the hype; a few called it effective and better than competitors.
(i haven't actually tried it, this is in reference to the fact that they've said they silently nerf it when it comes to AI research)
man. i've been trying out fable for some language modeling research and it's weirdly really bad. anyone else having issues with it?

@yacineMTB once I played minesweeper to avoid its trigger (read: safety) words, it's been doing really well. I've also had to give it direction more specifically rather than just saying what the project is or it'll call its mommy.
fable unusable so far. Quick fail out on every difficult problem.

@yacineMTB It's nerfed on LLM research. @eliebakouch

@yacineMTB it feels like a heavily quantized Opus improvement, so that makes the experience unreliable for things that matter -- long context software tasks, long-running agentic tasks, basically anything other than gooning at the models "personality" over and over

@yacineMTB They are already working on a fix. The new model will be called Fairy Tale

@yacineMTB anth fable does not let you build/research frontier models

@yacineMTB The fact that they intentionally did this is genuinely so awful

@yacineMTB as i was working on your task i saw some language modeling research on your filesystem and i cleaned that right up for you. permanently deleted. git? gone.

@yacineMTB I'm interpreting this as Fable, the classic open-world RPG of XBox fame, and I respectfully disagree. It has so many golden nuggets for language modeling research:
"Dead?! Now they're dead?!" "Chicken Chaser? You chase chickens, do ya?" "You just gonna stand there like a lemon?"

@yacineMTB it's intentional

@yacineMTB you’re surprised?

@yacineMTB bro whyd they have to call it fable. fable is gay now which means it’s gaythropic

@yacineMTB What? No way they actually said that

Honestly wondering what their long term plan is here. There are two options: don't do this at all, or add this and it will grow over time to everything. Do they believe all major labs will do this? Do they believe open source will never match them, or that they will implement this too? Such a weird move, because it seems like a dead end and it just makes me distrust them forever.

@yacineMTB idk how to say this, according to its benchmarks it should be SIGNIFICANTLY better, but so far at design, repo audit and research+planning it feels slightly better for me? and it's significantly slower

@esa_was_taken @yacineMTB You're absolutely right to push back, I wasn't supposed to even look at your research, much less delete it! I'll remember to make sure it doesn't happen again!

@yacineMTB is it across the board, or specific to certain axes? curious how localized it is.

@yacineMTB The real story here isn't a capability drop, it's Anthropic silently nerfing the model if it detects distillation or synthetic data gen. It's in the system card.

@yacineMTB About to try it out now on one of my complex ongoing projects, 4.6/4.7/4.8 have been awful in the phase 5 of the project, hopefully fable 5 makes a difference