Tech · 3h ago

Researcher Questions Use of Smaller Models for Routine Tasks

Original post

I think the assumption that you should use smaller models for less important tasks is flawed (or at least deserves much more careful consideration). Big models are generally better at everything but cost, so it is worth considering whether gains in non-key tasks would be valuable

11:06 AM · Jun 13, 2026 · 16.3K Views

Sentiment

Negative commenters criticized the post as cheerleading for larger models that ignores energy costs and fails to save money, while positive commenters agreed that quality matters more than cost-cutting.

Pos: 33.3%

Neg: 66.7%

10 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 6K · Bookmarks 6 · Likes 51 · Retweets 4 · Replies 6

Ethan Mollick@emollick

In fact, areas that are not important in your organization could benefit even more from applying smarter models, since the organizational capabilities & human capital in those areas are weak, and good AI might be a cheap way to increase capacity!

3h6K516

Jackson Atkins@JacksonAtkinsX

@emollick The “but cost” is doing a lot of work in that sentence especially when you pay API pricing.

3h681

Uri@urivalev

@JacksonAtkinsX @emollick why did he block you wtf

3h12

Tendies Of Wisdom@TendiesOfWisdom

@emollick It's not the quantity of tokens - it's the quality that matters 🔥🚀

2h31

blake@_buggles

@emollick Completely agree that the token optimizing cost cutting view is premature. We're still so early in this, and the CFOs are already trying to optimize abundance. I think it's fundamentally a category error on the opportunity of AI.

2h8

Brad Flaugher@BradFlaugher

@emollick I think "always use a big model" works for individual users but doesn't work at scale... but "pick the best model along the Pareto frontier for your task/budget" does. It narrows down model selection from hundreds to maybe 10 once you include on-device ones like Gemma.

2h802
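
The "Pareto frontier for your task/budget" idea in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model names, costs, and quality scores are hypothetical placeholders, not measurements, and "quality" stands in for whatever eval score you trust for your own task.

```python
from typing import NamedTuple, Optional


class ModelOption(NamedTuple):
    name: str
    cost: float     # hypothetical cost per task, arbitrary units
    quality: float  # hypothetical eval score in [0, 1]


def pareto_frontier(options: list[ModelOption]) -> list[ModelOption]:
    """Keep options that no other option matches or beats on both axes."""
    frontier = []
    for a in options:
        dominated = any(
            b.cost <= a.cost
            and b.quality >= a.quality
            and (b.cost < a.cost or b.quality > a.quality)
            for b in options
        )
        if not dominated:
            frontier.append(a)
    return sorted(frontier, key=lambda m: m.cost)


def best_within_budget(
    options: list[ModelOption], budget: float
) -> Optional[ModelOption]:
    """Pick the highest-quality frontier model whose cost fits the budget."""
    affordable = [m for m in pareto_frontier(options) if m.cost <= budget]
    return max(affordable, key=lambda m: m.quality) if affordable else None


# Hypothetical candidates; every number here is made up for illustration.
candidates = [
    ModelOption("model-a", 1.0, 0.70),
    ModelOption("model-b", 2.0, 0.65),   # costs more than model-a, scores lower
    ModelOption("model-c", 5.0, 0.85),
    ModelOption("model-d", 20.0, 0.92),
    ModelOption("model-e", 25.0, 0.90),  # costs more than model-d, scores lower
]

if __name__ == "__main__":
    print([m.name for m in pareto_frontier(candidates)])
    print(best_within_budget(candidates, budget=10.0))
```

With these made-up numbers, two of the five candidates drop out because a cheaper model scores at least as well, which is the narrowing effect the reply describes.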

Andrew Kirby@ajkirby01

@emollick What’s the point of getting 3 cheaper wrong answers for the price of a right answer?

3h302
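
One way to make the "cheaper wrong answers" question concrete is to compare expected cost per correct answer rather than cost per call. The sketch below assumes you retry until an answer is correct, that attempts are independent, and that checking correctness is free; all three are simplifying assumptions, and the prices and success rates are hypothetical.

```python
def expected_cost_per_correct(cost_per_call: float, success_rate: float) -> float:
    """Expected spend to get one correct answer when retrying until success.

    With independent attempts, the expected number of calls is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_call / success_rate


if __name__ == "__main__":
    # Hypothetical numbers: a small model at 1 unit per call that is right
    # 25% of the time, and a large model at 3 units that is right 90% of the time.
    small = expected_cost_per_correct(cost_per_call=1.0, success_rate=0.25)
    large = expected_cost_per_correct(cost_per_call=3.0, success_rate=0.9)
    print(f"small: {small:.2f} per correct answer, large: {large:.2f}")
```

Under these assumed numbers the model that is three times cheaper per call ends up more expensive per correct answer. With a higher small-model success rate the comparison flips, which is why the replies that recommend measuring with evals and the replies that recommend larger models can both be right.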

Ethan Mollick@emollick

@JacksonAtkinsX It is not doing a lot of work. And neither is your prompt. Blocked.

3h571

AI Research Tools 👨‍🎓 🧬🧪 🐬 🔬@airesearchtools

@emollick Agree

3h511

Uri@urivalev

@emollick @JacksonAtkinsX i am surprised you think you can identify an llm in a 10 word sentence

fwiw that sentence made sense to me and the account looks legit

these days i try to judge my content, not using tropes or not

(it can also be an llm assisted human)

3h291

Bill The Investor@billtheinvestor

@emollick Small model latency and throughput often negate cost advantages when integrating LLMs into real-time, high-frequency agentic workflows.

2h87

filipe@filicroval

@emollick a lot of ‘easy’ tasks still have surprisingly high variance in real-world inputs.

still, a setup that i like is working heavy tasks with stronger model and delegating smaller ones to Sonnet. it has been working well for me!

3h61

Sudhir Gajre@SudhirGajre

@emollick In enterprise settings processing 1000s of docs, defaulting to frontier models is often wasteful. Basic extraction tasks work great with optimized smaller models. Measure with strong evals, then use GEPA-style optimizers to close most of the quality gap at 5-20x lower cost.

3h59

nosimus@nosimus1

@emollick Depends on the task. Some smaller models are better at particular things than frontier models. For example, I recently switched a workflow that involved examining images and doing web searches from gpt 5.5 to Gemini Flash 3.0 and the output improved by 4x (rate of success)

3h58

tsunami_crypto@ls_brd

@emollick this assumes task importance stays fixed while the model handles it. sometimes a small model changes what tasks matter.

3h51

Blue@blueshopping24

@emollick Worth noting: in agentic chains the 'unimportant' task is often upstream of the important one. A cheap model's small error compounds downstream — the real cost isn't tokens, it's quality decay you only catch later.

3h41
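
The compounding point in the reply above can be illustrated with simple arithmetic. Assuming each step in a chain succeeds independently (a simplification; real pipelines may recover from or amplify errors), the end-to-end success rate is the product of the per-step rates. The step counts and rates below are hypothetical.

```python
import math


def chain_success_rate(step_success_rates: list[float]) -> float:
    """End-to-end success rate of a chain, assuming independent steps."""
    return math.prod(step_success_rates)


if __name__ == "__main__":
    # Hypothetical five-step agentic chain with the same model at every step.
    cheap = chain_success_rate([0.95] * 5)
    strong = chain_success_rate([0.99] * 5)
    print(f"95% per step over 5 steps: {cheap:.3f}")
    print(f"99% per step over 5 steps: {strong:.3f}")
```

Under this assumption a four-point gap per step grows to a much larger gap over five steps, so an "unimportant" upstream step can dominate the quality of the final output.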