Mikhail Parakhin's testing finds GPT-5.6 Max outperforms Opus 4.8 overall but trails Fable 5 on coding

GPT-5.6 Max beat Fable 5 on agentic workflows.

Original post

Not as relevant now :-(: I had an opportunity to deeply test both Fable 5 and GPT-5.6 Max. 5.6 is clearly better than Opus 4.8 at everything (slightly faster, too, though that depends on the load). Vis-à-vis Fable, it is clearly worse on coding, but better on agentic workloads. I had Fable write code, 5.6 run experiments - dreamy…

9:35 PM · Jun 26, 2026 · 225.5K Views

Sentiment

Positive commenters appreciate the hands-on comparison of GPT-5.6 Max against Opus 4.8 and Fable 5, while negative commenters complain that the models remain inaccessible due to release and policy barriers.

Pos: 50.0% · Neg: 50.0%

10 comments with sentiment.

Posts from X

Sebastian Szturo@SebastianSzturo

@MParakhin Now tell us how in the world you get access to both!?

14h12.7K22

Mikhail Parakhin@MParakhin

@veeransg5 No, ultra = workflows on high, the trick is to use max and then say: "Please start workflows with multiple agents..." - agents inherit max effort, makes a big difference.

13h6.3K4536

Mikhail Parakhin@MParakhin

@AlanRBlair I did. On agentic non-coding (taking actions) I found 5.6 clearly better. On discussing history/math Fable has an edge.

14h7.3K807

Mikhail Parakhin@MParakhin

@manabiSRS Ultra is just multiple agents, but each lower reasoning effort - high. You can start them from Max - then they inherit Max, makes a difference.

13h3.5K3112

Mikhail Parakhin@MParakhin

@Khalin_George Oh yeah, endless critique loop. Fable is very good, so, maybe only 3 iterations. 5.6 makes far fewer bugs than 4.8, but is no Fable - so, 7-8, even 9 iterations "review changes, find bugs - fix - review changes, find bugs - fix, ..."

13h10.7K3715
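
The "review changes, find bugs - fix" cycle described in the reply above can be sketched roughly as follows. This is a minimal illustration, not Parakhin's actual setup: `critique_loop`, `review`, and `fix` are hypothetical names, and in practice the two callables would wrap whatever model calls are used. The default cap of 9 rounds simply mirrors the upper figure quoted for 5.6.

```python
from typing import Callable, List, Tuple


def critique_loop(
    review: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    code: str,
    max_iterations: int = 9,
) -> Tuple[str, int]:
    """Alternate review and fix passes until a review reports no bugs.

    `review` and `fix` are hypothetical stand-ins for model calls.
    Returns the final code and the number of review passes performed.
    """
    for iteration in range(1, max_iterations + 1):
        bugs = review(code)
        if not bugs:
            # A clean review ends the loop early.
            return code, iteration
        code = fix(code, bugs)
    # Iteration budget exhausted; the last fix has not been re-reviewed.
    return code, max_iterations


# Toy stand-ins so the sketch runs without any model access.
def toy_review(code: str) -> List[str]:
    return ["BUG"] * code.count("BUG")


def toy_fix(code: str, bugs: List[str]) -> str:
    # Repairs one reported bug per pass.
    return code.replace("BUG", "ok", 1)


final_code, rounds = critique_loop(toy_review, toy_fix, "BUG BUG BUG")
```

With the toy stand-ins, three fix passes are followed by one clean review, so the loop stops after four review passes. A stronger coding model would, per the reply, reach the clean review in fewer rounds.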

Mikhail Parakhin@MParakhin

@SebastianSzturo Testing a preview. Don't have access to either anymore :-(

13h10.9K792

Alan Blair@AlanRBlair

@MParakhin Tell us more! No one else has tested both and spoken about it!

Any chance you used both for non-coding applications?

15h8.6K19

Mikhail Parakhin@MParakhin

@MKuliasov “Take these files with results of previous experiments, parse out x, y, z, analyze, figure out what went wrong, prepare new runs, schedule these machines, be careful with the machine X - it is running this other experiment, keep iterating”

5h1.6K52

Alan Blair@AlanRBlair

@MParakhin Amazing, thank you. Good to know someone can nip at Fable's heels.

The 2 days I used Fable, it did have a bit of that Big Model Smell - same vibes with 5.6? Or just an improvement on 5.5?

14h1K2

Krzysztof Gonia@kgonia7

@MParakhin What's the feeling of working with GPT-5.6 compared to Fable? I hate talking with LLMs outside of just work, but Fable, even in conversations about coding, gave a nice feeling of working with something that is capable, understands nuance, and is even empathetic.

12h4.2K11

Mikhail Parakhin@MParakhin

@david_saint_ 4.8 is better, these tests are all well documented: https://toloka.ai/arena

5h46722