Tech · 9h ago

Gergely Orosz, author of The Pragmatic Engineer, is seeking information on smart model routers that dynamically optimize inference costs

Story Overview

Gergely Orosz is asking the community for pointers on services that automatically pick the cheapest suitable model for each AI query, underscoring how teams juggling several providers now face mounting inference bills and want easier ways to switch without rewriting code.

Original post

Gergely Orosz@GergelyOrosz · #1443 in Tech

What are "smart" model routers you know of? Services or vendors that take queries and route the most efficient model they deem, saving cost.

I sense there is a massive demand for these, and will be even more...

4:58 AM · Jun 11, 2026 · 20.3K Views

Developer Impact

Existing routers already handle the heavy lifting

Factory Router, Not Diamond, Prism, and OpenRouter’s auto mode each claim to balance cost against quality or latency on the fly, though exact savings figures and evidence of production-scale use have yet to be published.
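
To make the cost-versus-quality trade-off concrete, here is a minimal sketch of what such a router might do. It is not any vendor's actual algorithm: the model names, prices, quality scores, latencies, and the keyword-based `estimate_difficulty` heuristic are all hypothetical stand-ins, chosen only to illustrate picking the cheapest model that clears an estimated quality bar.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical label, not a real model ID
    cost_per_1k_tokens: float  # illustrative price in USD
    quality: float             # illustrative capability score, 0..1
    p50_latency_ms: int        # illustrative median latency


# Made-up catalog, for illustration only.
CATALOG = [
    ModelOption("small", 0.0002, 0.55, 300),
    ModelOption("medium", 0.003, 0.75, 700),
    ModelOption("large", 0.015, 0.92, 1800),
]


def estimate_difficulty(prompt: str) -> float:
    """Crude stand-in for a learned difficulty classifier; returns 0..1."""
    hard_markers = ("prove", "refactor", "debug", "step by step")
    score = min(len(prompt) / 2000, 0.5)  # longer prompts count as harder
    if any(marker in prompt.lower() for marker in hard_markers):
        score += 0.6
    return min(score, 1.0)


def route(prompt: str, max_latency_ms: Optional[int] = None) -> ModelOption:
    """Return the cheapest model whose quality covers the estimated difficulty."""
    needed = estimate_difficulty(prompt)
    candidates = [
        m for m in CATALOG
        if m.quality >= needed
        and (max_latency_ms is None or m.p50_latency_ms <= max_latency_ms)
    ]
    if not candidates:
        # Nothing satisfies every constraint: fall back to the strongest model.
        return max(CATALOG, key=lambda m: m.quality)
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)
```

In this sketch the savings depend entirely on how well `estimate_difficulty` judges the query. A real router would presumably replace the keyword heuristic with a trained classifier or a cheap model, and an underestimate there sends a hard query to a model that is too weak.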

Open Question

Demand is loud but adoption numbers stay quiet

Orosz and replies note rising interest tied to provider risk and workload growth, yet no public figures show how many teams have switched or how much they have saved so far.

Sentiment

Positive replies praise Not Diamond's contributors. Negative replies accuse Copilot of deliberately selecting expensive models, question Cognition's integrity over misleading claims about Devin, and call the vendor landscape a mess.

Positive: 33.3%

Negative: 66.7%

7 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 5K · Bookmarks: 42 · Likes: 47 · Replies: 8

Gergely Orosz@GergelyOrosz

Solutions I collected so far (no affiliation with either):

- Factory Router
- Not Diamond
- Prism by Augment Code

AI gateways with routing:

- OpenRouter (auto router)
- Kilo Gateway
- Requestly
- LiteLLM (auto routing)

More pointers welcome!

8h ago

Retweets: 1

Matthew Berman@MatthewBerman

@GergelyOrosz @tomas_hk

5h ago

Gergely Orosz@GergelyOrosz

@dabit3 @cognition yes but I have trust issues with Devin, given the company lied about Devin's capabilities on launch, never addressed it, apologized or corrected it

just ignoring Cognition till it happens. 2 years and counting I think

8h ago

nader dabit@dabit3

@GergelyOrosz @cognition Devin

8h ago

Marc-André Moreau@awakecoding

@GergelyOrosz OpenRouter has one: https://openrouter.ai/openrouter/auto

9h ago

Matthew Lam@mattlam_

@GergelyOrosz Cognition has adaptive routing right @dabit3

8h ago

Fer@harderthanfire

@GergelyOrosz Morph also have one: https://docs.morphllm.com/sdk/components/router

8h ago

VLAD ARBATOV@vladzima

@GergelyOrosz routellm from lmsys is the open source take, martian and notdiamond sell it as a product. NB: router has to judge difficulty with a cheap model, and it misjudges where routing matters

8h ago

Gergely Orosz@GergelyOrosz

@tsmaeder @cursor_ai is there an inch of documentation on this? found nothing on their site

7h ago

Gergely Orosz@GergelyOrosz

@harderthanfire unclear to me if this works only with their models tho. reads like it?

8h ago

Gergely Orosz@GergelyOrosz

@vladzima routellm has not been updated in 2 years tho?

8h ago

Thomas Mäder 🇨🇭🇨🇦 🇺🇦@tsmaeder

@GergelyOrosz @cursor_ai https://cursor.com/docs/models-and-pricing

6h ago

Fer@harderthanfire

@GergelyOrosz This is their example model selection and you can specify what provider and/or specific models:

8h ago

yenkel@yenkel

@GergelyOrosz not diamond

@tomas_hk is a beast

8h ago

Tomas Hernando Kofman@tomas_hk

@GergelyOrosz Thanks for the s/o, have been working on this problem since 2023. For folks interested in learning more, we've written our learnings on smart routing, together with what distinguishes it from gateways and deterministic routing, here: https://www.notdiamond.ai/blog/a-comprehensive-guide-to-model-routing

5h ago

Gergely Orosz@GergelyOrosz

@tsmaeder @cursor_ai ah so this is flat pricing for models THEY choose. not exactly optimising models you have!

6h ago

Thomas Mäder 🇨🇭🇨🇦 🇺🇦@tsmaeder

@GergelyOrosz @cursor_ai I don't know (not my neck of the woods), but maybe I can ask around (or some helpful person shows up here). What info are you looking for?

7h ago

Gergely Orosz@GergelyOrosz

@dabit3 @cognition this was the lie in the launch post

integrity matters at startups, and I do not see Cognition having much, never having owned up to this even

http://youtube.com/watch?v=tNmgmwEtoWE

8h ago

VLAD ARBATOV@vladzima

@GergelyOrosz oh sht you're right I didn't even notice

8h ago

Yury Molodtsov ⚡️@y_molodtsov

@awakecoding @GergelyOrosz I thought it was about optimizing for quality and not cost?

8h ago