This is really good.
OpenAI just moved frontier-level health AI from premium reasoning models into the free GPT-5.5 Instant model.
GPT-5.5 Instant now performs near OpenAI’s Thinking models on health evaluations, meaning the cheaper, faster default model is being trained to behave more like the slower models that spend extra computation checking their reasoning.
The update targets the gap between a chatbot that sounds fluent and a health assistant that knows when to slow down, ask for missing details, admit uncertainty, and push the user toward care when symptoms look urgent.
OpenAI says more than 230 million people ask ChatGPT health and wellness questions every week, so moving this capability into the free product changes the scale from premium assistance to mass access.
From OpenAI's blog looks like they did a huge "distillation" to achieve this. i.e. a stronger teacher model and human experts create high-quality responses, and a cheaper student model learns the answer patterns without repeating the same expensive internal search every time.
i.e. OpenAI's training loop was heavily physician-shaped: more than 260 doctors across 60 countries, 49 languages, and 26 specialties reviewed over 700,000 model responses and judged whether answers were accurate, cautious, clear, complete, and useful.
OpenAI's likely mechanism seems to be a mix of supervised fine-tuning, where Instant is shown better answers, and preference training, where it learns which answer a physician-led rubric prefers when two outputs differ.
The physician part is crucial because the target is not just “medical facts,” but clinical response behavior, such as asking for age, pregnancy status, duration, medication history, severe pain, breathing trouble, fever, neurological symptoms, or other missing context before giving guidance.
So the strongest improvement is not medical trivia but behavior under uncertainty, because a good health answer often means saying what cannot be known yet, what context is missing, what red flags matter, and what the next safe step should be.
OpenAI also reports 71% fewer flagged factuality issues in real health traffic over two months, which suggests the update is reducing wrong claims in everyday use rather than only improving benchmark scores.