Why don’t LLMs just tell you when you are asking a question / doing something that is out of distribution?
Logan Kilpatrick questions why large language models fail to notify users when inputs fall outside their training distribution
Replies noted detection risks and coding agent breakdowns on novel tasks.
Supportive replies praised the question and said built-in uncertainty flags would be highly useful; critical replies called the models hallucinatory or programmed to lie.

What are out of distribution inputs?
Out-of-distribution inputs are user queries or tasks that fall outside an LLM’s training data distribution. Current models lack built-in signals to flag them and often produce unreliable or hallucinated responses instead. Outlier detection could help, but as one reply noted, a detector that errored on most everyday requests would be unusable.
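
To make that trade-off concrete, here is a minimal sketch of an outlier detector. It assumes a toy bag-of-words representation and a hypothetical `reference_prompts` list standing in for "the training distribution"; a real system would use learned embeddings, and nothing here reflects how any actual LLM works. The score is a scalar rather than a binary flag, and the last lines show how a strict threshold ends up flagging an ordinary request too.

```python
import math
from collections import Counter

def bow(text):
    """Toy bag-of-words vector: lowercase token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def ood_score(query, reference_prompts):
    """Scalar score: 1 minus the similarity to the nearest reference prompt."""
    q = bow(query)
    return 1.0 - max(cosine(q, bow(r)) for r in reference_prompts)

# Hypothetical stand-in for the training distribution (an assumption for illustration).
reference_prompts = [
    "write a python function to sort a list",
    "fix the bug in this react component",
    "explain how binary search works",
]

everyday = "write a python function to reverse a list"
novel = "build my own rocket simulator"

scores = {q: ood_score(q, reference_prompts) for q in (everyday, novel)}

# A lenient threshold separates the two; a strict one flags both.
flagged_lenient = [q for q, s in scores.items() if s > 0.5]
flagged_strict = [q for q, s in scores.items() if s > 0.05]
```

The everyday request scores low and the novel one scores high, but the result depends entirely on where the threshold sits, which is the practical objection raised in the replies.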
@OfficialLoganK Was just thinking something similar as I was writing this https://x.com/omarsar0/status/2056392467604205852?s=20. "It doesn't know when it doesn't know" is a classical weakness for which no good solutions exist. Autoregressive nature of it, I guess (lazy answer).
Every time I ask my 10-year-old to use coding agents, he gets extremely disappointed.
It turns out that all he wants is to build his own rocket simulator.
No amount of context engineering helps. No model works. All coding agents fail.
That's just one example. He has many use cases where coding agents really suck: learning apps and other types of science-centered simulators.
It's not like he is trying to be adversarial or break the system. I use the coding agents so extensively in my codebases that I just assumed that he would get similar results. It's not the case. And I think this is happening across all kinds of domains.
I know he is not the target user. I get all that. But if all these claims about superintelligent AI on the horizon (12-18 months) are right, then coding agents shouldn't struggle so much to build any of the things he wants.
The reality is that coding agents can help maintain and build complex things that aim to extend what exists in abundance in the training data. No surprises there. There is plenty of AI research to explain the OOD issues with LLMs.
I think there is a massive opportunity here. Potentially a more generalized harness (something I have been working on). It doesn't have to work super well now, but it tests on edge use cases as newer models and capabilities emerge.
IMO, all of this is a good indicator that LLMs are nowhere close to AGI or whatever they call it these days. Every day that passes, I am more convinced that we need to quickly move beyond LLMs and into things like native multi-modal systems and world models.
@OfficialLoganK @JagersbergKnut Erroring on 99% of my requests would suck

@OfficialLoganK Because that would usually require consciousness and you’re just interacting with a token prediction machine that occasionally mimes those patterns
@OfficialLoganK "There is no distribution." - @roydanroy
@OfficialLoganK Outlier detection could come in handy there. I know a guy.

@OfficialLoganK i built this specifically!! https://github.com/peytontolbert/ConfidenceTransformer

@OfficialLoganK Confidence calibration is the missing bridge between LLM outputs and actual reliability.
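
Calibration can be measured. One common measure is expected calibration error (ECE): group predictions by stated confidence and compare each group's average confidence with its actual accuracy. The sketch below is a plain-Python version run on made-up numbers; the function name, the bin count, and the data are illustrative assumptions, not taken from any reply in this thread.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Made-up example: both models are right half the time.
outcomes = [True, True, False, False]
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], outcomes)
calibrated = expected_calibration_error([0.5, 0.5, 0.5, 0.5], outcomes)
```

Here the model that claims 90% confidence while being right half the time has a large gap, and the one that claims 50% has none, even though both give the same answers.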

@OfficialLoganK with the right tools and environment - the LLM itself is the distribution! ^^

@OfficialLoganK Because models are trained to be helpful, and 'I'm not sure this is in my wheelhouse' reads as unhelpful. But you're right — epistemic honesty about distribution boundaries would make outputs far more trustworthy. It's a training objective tradeoff.

@OfficialLoganK it's just the bitter lesson in action. we train them with pure compute for next-token prediction rather than hardcoding confidence rules. the model literally doesn't know what it doesn't know.

@OfficialLoganK Because they don't know they're out of distribution. The same process that generates a confident wrong answer generates a confident right one. There's no separate uncertainty layer, just next-token prediction all the way down.
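
A small sketch of the point above, using made-up logits over a hypothetical three-token vocabulary for the prompt "The capital of France is". Any confidence signal read off the next-token distribution (top probability, entropy) comes from the same numbers that pick the token, so an equally peaked distribution looks equally confident whether the top token is right or wrong.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; lower means a more peaked distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical vocabulary and logits, chosen only for illustration.
vocab = ["Paris", "Lyon", "Rome"]
logits_right = [5.0, 1.0, 0.5]  # peaked on the correct token
logits_wrong = [0.5, 1.0, 5.0]  # equally peaked on a wrong token

p_right = softmax(logits_right)
p_wrong = softmax(logits_wrong)

top_right = vocab[p_right.index(max(p_right))]
top_wrong = vocab[p_wrong.index(max(p_wrong))]

# Both confidence signals are identical across the two cases.
same_top_prob = abs(max(p_right) - max(p_wrong)) < 1e-9
same_entropy = abs(entropy(p_right) - entropy(p_wrong)) < 1e-9
```

The two cases differ only in which token wins; nothing in the distribution itself marks one of them as the mistake.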

@OfficialLoganK LLMs are trained to be helpful, not fact-checkers—so they’ll often roll with ambiguous prompts to avoid killing the vibe. Also, defining ‘out of distribution’ for something as fluid as human curiosity is like herding cats wearing roller skates. 😅

@OfficialLoganK @grok how would they know whether it’s out of distribution or not? also, wouldn’t it be scalar not binary?

@OfficialLoganK that would be a nice capability Logan
a leading AI lab should do that

They actually can to a degree. You just have to ask it for its self-confidence or uncertainty. In fact, all forms of LLM tools such as search, calculation, generation/thinking, etc. can be viewed as uncertainty resolvers.
As noted, only to a degree, because they have no experience: they can infer uncertainty about concepts but cannot draw on actual experience.
That's the basis of my custom prompt below:
https://gist.github.com/CurtisAccelerate/e158922548b1cfe594fe2a8eecf941ac

@OfficialLoganK Wouldn't the ability to do so mean the response is in distribution, and so to be reached from anywhere out of distribution requires strong attractors? Unless you mean a way extrinsic to the model.

@OfficialLoganK Also, why wouldn't LLMs just predict that they are about to decode a mistake and decode something else instead

@OfficialLoganK because they don't know what they don't know, that's kind of the whole problem

Because “out of distribution” is not the same as “I don’t know.”
A model can be unfamiliar with the exact fact pattern but still reason from supplied context.
It can also be in-distribution and confidently wrong.
The useful feature is calibrated uncertainty with a recovery path.
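
One way to picture "calibrated uncertainty with a recovery path" is a thin wrapper that returns the answer when confidence clears a threshold and otherwise hands off to a recovery step such as search or a clarifying question. This is a minimal sketch: `answer_with_recovery`, `stub_model`, `stub_recover`, and the 0.7 threshold are all hypothetical names and values, and it assumes the confidence number is already reasonably calibrated.

```python
def answer_with_recovery(question, model, recover, threshold=0.7):
    """Return the model's answer if confident enough, else take the recovery path."""
    answer, confidence = model(question)
    if confidence >= threshold:
        return answer
    return recover(question, answer, confidence)

def stub_model(question):
    """Hypothetical model: returns (answer, confidence) pairs."""
    known = {"2 + 2": ("4", 0.99)}
    return known.get(question, ("(guess)", 0.2))

def stub_recover(question, draft, confidence):
    """Hypothetical recovery step: admit uncertainty and suggest a next action."""
    return f"Not sure (confidence {confidence:.2f}); try a search for: {question}"

confident_reply = answer_with_recovery("2 + 2", stub_model, stub_recover)
uncertain_reply = answer_with_recovery("rocket simulator drag model", stub_model, stub_recover)
```

The design choice is that low confidence triggers an action rather than a refusal, which addresses the worry elsewhere in the thread about erroring on most requests.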

@OfficialLoganK Yes, hallucination is really a headache. 😿