/AI · 19h ago

Nick Dobos, Grimoire creator, warns that mid-conversation model switching causes context loss and double-paying for reprocessed prompts

Prompt caching offers a more reliable 60-80% cost reduction.

Original post

Nick Dobos (@NickADobos) · #1894 in AI

There are also major continuity issues when swapping models, especially across providers: OpenAI’s and Anthropic’s thinking tokens aren’t compatible and can’t be swapped out.

So swapping models requires removing history, which seems like a recipe for easy context loss.

Nick Dobos (@NickADobos)

Potentially dumb question, but doesn’t cache busting mean this is actually worse for anything that you can’t one-shot?

1st prompt: Yay, cheap model.

2nd prompt: Oh wait, we need the big model, which means we resubmit your entire chat history.

So you pay full price for the 2nd prompt, which includes paying for the uncached preceding history. That means you have now paid twice for the first prompt: once for the cheap model and once for the big model.

The benchmarks are very misleading here because they only test a single prompt. If you need any sort of follow-up, you are paying extra.

9:32 PM · Jun 2, 2026 · 1.4K Views
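
The double-payment argument in the post can be checked with a little arithmetic. The sketch below is illustrative only: the `Pricing` class, the function names, and every price in it are made-up assumptions, not real provider rates. It also counts input tokens only and ignores output tokens and any cache-write surcharge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical input prices in USD per million tokens (not real rates)."""
    input_per_mtok: float
    cached_input_per_mtok: float


def input_cost(tokens: int, price_per_mtok: float) -> float:
    """Cost of sending `tokens` input tokens at the given per-million price."""
    return tokens * price_per_mtok / 1_000_000


def switch_mid_conversation(history: int, followup: int,
                            cheap: Pricing, big: Pricing) -> float:
    """Turn 1 on the cheap model, turn 2 on the big model.

    The big model has never seen the history, so nothing is cached
    and the whole history is billed again at the big model's full price.
    """
    turn1 = input_cost(history, cheap.input_per_mtok)
    turn2 = input_cost(history + followup, big.input_per_mtok)
    return turn1 + turn2


def stay_on_big_model(history: int, followup: int, big: Pricing) -> float:
    """Both turns on the big model, with the history served from cache on turn 2."""
    turn1 = input_cost(history, big.input_per_mtok)
    turn2 = (input_cost(history, big.cached_input_per_mtok)
             + input_cost(followup, big.input_per_mtok))
    return turn1 + turn2


# Assumed numbers, chosen only to make the arithmetic easy to follow.
CHEAP = Pricing(input_per_mtok=2.0, cached_input_per_mtok=0.2)
BIG = Pricing(input_per_mtok=10.0, cached_input_per_mtok=1.0)

switched = switch_mid_conversation(10_000, 1_000, CHEAP, BIG)
stayed = stay_on_big_model(10_000, 1_000, BIG)
print(f"switch mid-conversation: ${switched:.4f}")
print(f"stay on big model:       ${stayed:.4f}")
```

Under these assumed prices the switch costs about $0.13 in input tokens against about $0.12 for staying on the big model. The gap comes down to one comparison: the cheap model's full price for the history versus the big model's cached price for the same history. Different prices or longer conversations would shift the result.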
