Google's Gemini 3.5 Flash records 1656 Elo on GDPval-AA matching M4o 2.5 Flash while trailing GPT-5 and Claude-Opus 4.7 · Digg

Google's Gemini 3.5 Flash records 1656 Elo on GDPval-AA matching M4o 2.5 Flash while trailing GPT-5 and Claude-Opus 4.7 · Digg

Posts from X

Most Activity

VIEWS1.3M

Michael Truell@mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval.

We’ll keep updating the leaderboard as new models come out.

https://cursor.com/evals

41d1.3M1.2K209

BOOKMARKS1.1K

Theo - t3.gg@theo

Oh my god it scored worse than Composer 2! Not even 2.5! And it cost 4x more to run!!!

This might be the worst major lab model drop of all time. Llama 4 tier. Insane.

Michael Truell@mntruell

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval.

We’ll keep updating the leaderboard as new models come out.

https://cursor.com/evals

41d1M5.7K1.1K

LIKES9.3KRETWEETS918

Google@Google

Meet Gemini 3.5 Flash — our strongest agentic and coding model yet.

It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost.

Generally available, starting today. 🧵

#GoogleIO

41d776.7K9.3K1K

REPLIES776

Logan Kilpatrick@OfficialLoganK

Gemini 3.5 feels like the start of a new era for Gemini, we spent the last 2.5 years putting the infrastructure, products, team, etc in place (learning lots of lessons along the way).

The model is the product, please keep the feedback coming!

40d203K2.3K145

Logan Kilpatrick@OfficialLoganK

Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own.

We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!

41d546K7.1K875

Sundar Pichai@sundarpichai

Just off stage at #GoogleIO, some highlights from this morning 🧵

Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs.

Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also comparable to the best models but very fast (4x faster tokens/ second than other frontier models). And when looking at the intelligence versus output speed, it’s in a league of its own in the top right quadrant.

41d293.4K3.8K430

Google DeepMind@GoogleDeepMind

Introducing Gemini 3.5: our newest family of models combining frontier intelligence with real-world action.

The first release is 3.5 Flash, our strongest model yet for agents and coding 🧵

41d707.3K3.6K466

Google@Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows.

3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

Google@Google

Meet Gemini 3.5 Flash — our strongest agentic and coding model yet.

It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost.

Generally available, starting today. 🧵

#GoogleIO

41d935.9K2.3K635

kache@yacineMTB

Holy shit man

Google@Google

Gemini 3.5 Flash is built to help you execute complex, agentic workflows.

3.5 Flash rivals flagship models to deliver frontier performance for agents and coding, at the lightning speeds you expect from the Flash series.

41d435.2K2.4K635

Demis Hassabis@demishassabis

Gemini 3.5 Flash is amazing!

- Performs better than 3.1 Pro on coding & agentic tasks - 4x faster than other frontier models - 12x faster in @antigravity - 800 tokens/sec! - Often at less than half the cost

And Pro to come…

Try it in @antigravity, @GeminiApp & more - enjoy!

41d196.3K3.1K246

Google@Google

Gemini Omni Flash is rolling out starting today.

Here’s where you can find it:

🔹 Today: Google AI Plus, Pro and Ultra subscribers globally in the @GeminiApp and @FlowbyGoogle .

🔹Rolling out starting this week, for no cost: @YouTube Shorts and the YouTube Create app.

🔹Coming weeks: developers and enterprise customers via APIs.

#GoogleIO

41d154.5K1.9K295

Lisan al Gaib@scaling01

bruh

41d232.2K2.2K286

Logan Kilpatrick@OfficialLoganK

Gemini 3.5 Flash ranks #1 on the APEX-Agents-AA benchmark, outperforming much larger models a whole size above it.

39d465.5K1.7K135

Theo - t3.gg@theo

Wait wtf, they STILL haven't updated pretraining???

0.005 Seconds (3/694)@seconds_0

gemini 3.5 knowledge cutoff is jan 2025 (TWENTY TWENTY FIVE) [17 months ago]

Its cool that its a refinement of the 3 pro base, but insane that they still havent pretrained a new model to release

40d257.9K2.5K146

Jeff Dean@JeffDean

1/ Today at #GoogleIO, we’re releasing Gemini 3.5, our latest family of models combining frontier intelligence with action.

We’re starting by releasing 3.5 Flash, which is built to help you execute complex, long-horizon agentic workflows.

Gemini 3.5 Flash is our strongest model for coding and agent http://yet.It outscores 3.1 Pro on agentic and coding benchmarks like Terminal-Bench and MCP Atlas, while running 4x faster than other frontier models.

Used in Google Antigravity, 3.5 Flash is even further optimized to be up to 12x faster. It’s a powerful engine to deploy sub-agents that collaborate, run high-frequency iterative loops, and solve real-world problems at scale.

Some highlights we’re excited about 🔽

41d119K1.5K228

Theo - t3.gg@theo

Gemini 3.5 Flash is a really interesting release.

It's super fast and surprisingly smart. It's also more expensive (3x more per token) and super token hungry.

The result - it costs 2x more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium.

41d153.2K1.8K210

Lisan al Gaib@scaling01

GPT-5.5-medium has lower end-to-end latency, uses less tokens and is overall smarter and cheaper than Gemini 3.5 Flash

it might genuinely be over for anyone not named OpenAI or Anthropic

Lisan al Gaib@scaling01

some more Gemini 3.5 Flash benchmarks by Artificial Analysis AI: - the APEX-Agents-AA score is excellent - expected higher on CritPt - reasoning efficiency could also be better, but it kind of depends on what setting they used. if it's the max then it's very good - Price/Perf looks pretty bad vs GPT-5.5. You can get GPT-5.5-medium with better performance for less and with faster responses

41d172.1K1.5K236

Theo - t3.gg@theo

Clearly these people haven’t experienced the magic of Gemini 3.5 Flash Preview on High in the new Antigravity CLI

BladeoftheSun@BladeoftheS

Google CEO tries to tell University students to love AI.

They tell him to BOO off.

This is what most people think of the hated AI, we don't want it.

41d157.1K1.8K178

Logan Kilpatrick@OfficialLoganK

Massive updates to @GoogleAIStudio and the Gemini API 🤯

- Gemini 3.5 Flash! - managed agents so you can easily build agentic products with the antigravity harness - native Android app creation right in AI Studio - native workspace integrations - 1 click export to antigravity

41d67.1K1.4K158

INIYSA@lafaiel

Gemini 3.5 is 30x more expensive than 1.5

41d306.7K1.6K148