
GPT-5.5 achieves 99.46 percent accuracy on multi-digit multiplication across a 20-by-20 grid of problems with up to 20 digits per number

With medium reasoning effort, the model answered correctly across nearly the entire accuracy heatmap; without reasoning, accuracy was low.

Original post

I redid the multi-digit multiplication experiment, now with gpt-5.5. With medium reasoning and 7 samples each cell, it pretty much aced the test with 99.46% accuracy. The model had no tools to call and had to rely on its reasoning. Can it go further? (1/4)

1:24 AM · May 22, 2026
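The setup described in the post (a grid of digit-length pairs, several samples per cell, answers scored without tools) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's actual harness: the function names (`make_problem`, `evaluate_grid`, `overall_accuracy`, `exact_solver`) are hypothetical, the grid is assumed to cover digit counts 1 through 20 on each axis, and `solver` is a stand-in for whatever call returns the model's answer.

```python
import random


def make_problem(digits_a, digits_b, rng):
    """Draw two random integers with exactly the requested digit counts."""
    a = rng.randrange(10 ** (digits_a - 1), 10 ** digits_a)
    b = rng.randrange(10 ** (digits_b - 1), 10 ** digits_b)
    return a, b


def evaluate_grid(solver, max_digits=20, samples_per_cell=7, seed=0):
    """Score `solver` on every (digits_a, digits_b) cell of the grid.

    Returns a dict mapping each cell to the fraction of samples answered
    correctly. Python integers are arbitrary precision, so `a * b` serves
    as the exact reference answer.
    """
    rng = random.Random(seed)
    grid = {}
    for digits_a in range(1, max_digits + 1):
        for digits_b in range(1, max_digits + 1):
            correct = 0
            for _ in range(samples_per_cell):
                a, b = make_problem(digits_a, digits_b, rng)
                if solver(a, b) == a * b:
                    correct += 1
            grid[(digits_a, digits_b)] = correct / samples_per_cell
    return grid


def overall_accuracy(grid):
    """Average the per-cell accuracies (every cell has equal sample count)."""
    return sum(grid.values()) / len(grid)


def exact_solver(a, b):
    """Placeholder for the model call (assumption); always answers correctly."""
    return a * b
```

Under these assumptions a 20-by-20 grid with 7 samples per cell comes to 2,800 problems. If the reported 99.46% is a per-sample average over all of them (the post does not say), it would correspond to about 15 wrong answers, since 2785 / 2800 ≈ 0.9946.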

I still occasionally hear people claim that LLMs are hilariously bad at arithmetic. Another reminder that it's not 2022 anymore.

cozyblaze (@cozyblaze265065)

I redid the multi-digit multiplication experiment, now with gpt-5.5. With medium reasoning and 7 samples each cell, it pretty much aced the test with 99.46% accuracy. The model had no tools to call and had to rely on its reasoning. Can it go further? (1/4)

8:24 AM · May 22, 2026 · 78.7K Views
3:03 PM · May 22, 2026 · 1.8K Views