What are "smart" model routers you know of? Services or vendors that take queries and route each one to the model they deem most efficient, saving cost.
I sense there is a massive demand for these, and will be even more...
Gergely Orosz is asking the community for pointers on services that automatically pick the cheapest suitable model for each AI query, underscoring how teams juggling several providers now face mounting inference bills and want easier ways to switch without rewriting code.
Factory Router, Not Diamond, Prism, and OpenRouter’s auto mode each claim they can balance cost against quality or latency on the fly, though exact savings and production scale are still emerging.
Orosz and replies note rising interest tied to provider risk and workload growth, yet no public figures show how many teams have switched or how much they have saved so far.
Supportive replies praise contributors to Not Diamond, while critical ones accuse Copilot of deliberately selecting expensive models, question Cognition's integrity over misleading Devin claims, and call the vendor landscape a mess.
Solutions I collected so far (no affiliation with any of them):
- Factory Router
- Not Diamond
- Prism by Augment Code
AI gateways with routing:
- OpenRouter (auto router)
- Kilo Gateway
- Requestly
- LiteLLM (auto routing)
More pointers welcome!

@dabit3 @cognition yes but I have trust issues with Devin, given the company lied about Devin's capabilities on launch, never addressed it, apologized or corrected it
just ignoring Cognition till it happens. 2 years and counting I think

@GergelyOrosz @cognition Devin

@GergelyOrosz OpenRouter has one: https://openrouter.ai/openrouter/auto

@GergelyOrosz Cognition has adaptive routing right @dabit3

@GergelyOrosz Morph also have one: https://docs.morphllm.com/sdk/components/router

@GergelyOrosz routellm from lmsys is the open source take, martian and notdiamond sell it as a product. NB: router has to judge difficulty with a cheap model, and it misjudges where routing matters
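
To make the point in the reply above concrete, here is a minimal sketch of threshold-based routing: a cheap judge scores each query's difficulty, and the router escalates to a stronger model only above a cutoff. Everything in it is an assumption for illustration: `toy_difficulty`, the marker list, the threshold, and the model names `cheap-model` and `strong-model` are hypothetical and are not taken from RouteLLM, Martian, or Not Diamond.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Route:
    model: str    # which model the query was sent to
    score: float  # difficulty estimate in [0, 1] from the cheap judge


def toy_difficulty(query: str) -> float:
    """Hypothetical stand-in for a cheap difficulty judge (not a real product's logic)."""
    hard_markers = ("prove", "refactor", "debug", "optimize", "architecture")
    hits = sum(marker in query.lower() for marker in hard_markers)
    length_term = min(len(query.split()) / 50, 1.0)
    return min(0.3 * hits + 0.4 * length_term, 1.0)


def route(
    query: str,
    judge: Callable[[str], float] = toy_difficulty,
    threshold: float = 0.3,             # assumed cutoff; a real system would tune this
    cheap_model: str = "cheap-model",   # placeholder model names
    strong_model: str = "strong-model",
) -> Route:
    """Send the query to the strong model only if the judge scores it at or above the threshold."""
    score = judge(query)
    model = strong_model if score >= threshold else cheap_model
    return Route(model=model, score=score)


if __name__ == "__main__":
    for q in (
        "What is the capital of France?",
        "Debug and refactor this module",
        "Why does this loop never terminate?",  # arguably hard, but the toy judge scores it low
    ):
        print(q, "->", route(q))
```

The third query shows the failure mode the reply describes: a short question with no obvious markers can still be hard, and a cheap judge sends it to the cheap model anyway.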

@tsmaeder @cursor_ai is there an inch of documentation on this? found nothing on their site

@harderthanfire unclear to me if this works only with their models tho. reads like it?

@vladzima routellm has not been updated in 2 years tho?

@GergelyOrosz @cursor_ai https://cursor.com/docs/models-and-pricing

@GergelyOrosz This is their example model selection and you can specify what provider and/or specific models:

@GergelyOrosz not diamond
@tomas_hk is a beast

@GergelyOrosz Thanks for the s/o, have been working on this problem since 2023. For folks interested in learning more, we've written our learnings on smart routing, together with what distinguishes it from gateways and deterministic routing, here: https://www.notdiamond.ai/blog/a-comprehensive-guide-to-model-routing

@tsmaeder @cursor_ai ah so this is flat pricing for models THEY choose. not exactly optimising models you have!

@GergelyOrosz @cursor_ai I don't know (not my neck of the woods), but maybe I can ask around (or some helpful person shows up here). What info are you looking for?

@dabit3 @cognition this was the lie in the launch post
integrity matters at startups, and I do not see Cognition having much, never having owned up to this even
http://youtube.com/watch?v=tNmgmwEtoWE

@GergelyOrosz oh sht you're right I didn't even notice

@awakecoding @GergelyOrosz I thought it was about optimizing for quality and not cost?