7h ago

Gemini Flash 3.5 matches Sonnet-class models on agentic tasks but costs 7.46 times more than GPT-5.5 at $22.96 per run on PencilPuzzleBench

Excessive verbosity drives context overflows and higher expenses.

Original post

Day 2 Vibes On Gemini Flash 3.5
- Sonnet class model
- More expensive than Sonnet in real-world usage
- GPT 5.5 & Sonnet/Opus still maintain lead
Real problem - It's just too expensive as it spins on agentic problems

10:21 AM · May 20, 2026

what the actual fuck

Gemini 3.5 Flash is 7.46 times more EXPENSIVE than GPT-5.5-xhigh on PencilPuzzleBench

(direct ask scores are below gpt-5.2-high)

7:11 PM · May 20, 2026 · 21.1K Views

the agentic score by itself is fine

but the cost is not real

Quoted post: Lisan al Gaib (@scaling01), 7:11 PM · May 20, 2026 · 21.1K Views
7:17 PM · May 20, 2026 · 1.6K Views