Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care
Achieves 55.1 on HealthBench Professional, beating GPT 5.5
Context: Baichuan is one of the prominent AI startups/labs in China, mostly focusing on AI in healthcare. They've previously released Baichuan-M1 through M3, along with technical reports.
They have now released a technical report for Baichuan-M4, although the model itself is not open-source :(
Baichuan-M4 is designed as a clinical-grade medical agent system, supporting patient consultation, follow-up, continuous care, evidence-based retrieval, medical image understanding, long-term patient memory, and multi-agent coordination in controlled environments.
RL training: "SPAR++ replaces coarse-grained scoring of an entire dialogue trajectory with reward signals anchored to key clinical spans. The model is not only rewarded for reaching the correct final conclusion, but also for sufficient history taking, timely risk identification, and appropriate tool use."
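To make the contrast in that quote concrete, here is a minimal sketch of trajectory-level scoring versus span-anchored rewards. This is not the report's SPAR++ implementation: the `Span` type, the span labels, the weights, and the choice to place each reward on the turn that closes a span are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    end_turn: int  # index of the dialogue turn that closes this clinical span
    kind: str      # hypothetical label, e.g. "history_taking"
    score: float   # judged quality of the span, assumed to lie in [0, 1]


def trajectory_reward(final_correct: bool) -> float:
    """Coarse baseline: a single scalar for the whole dialogue."""
    return 1.0 if final_correct else 0.0


def span_anchored_rewards(
    num_turns: int,
    spans: List[Span],
    final_correct: bool,
    weights: Dict[str, float],
) -> List[float]:
    """Per-turn rewards: each span pays out at its closing turn,
    and the final-conclusion reward is added to the last turn."""
    if num_turns <= 0:
        raise ValueError("num_turns must be positive")
    rewards = [0.0] * num_turns
    for span in spans:
        if not 0 <= span.end_turn < num_turns:
            raise ValueError(f"span ends outside the dialogue: {span}")
        # Unknown span kinds get weight 0 (an assumption, not from the report).
        rewards[span.end_turn] += weights.get(span.kind, 0.0) * span.score
    rewards[-1] += trajectory_reward(final_correct)
    return rewards
```

With this shape, a dialogue that reaches the right conclusion but skips history taking still loses reward at the turn where the history should have been taken, which a single end-of-trajectory score cannot express.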
"In mixed initial-visit and follow-up scenarios, M4 uses a curriculum learning strategy [9] of “building the foundation with initial visits first, then improving performance with follow-ups."
Baichuan-M4 is trained with tools for dynamic memory management, retrieval of authoritative medical evidence, and multimodal perception (OCR+X-ray+dermatology).
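For readers unfamiliar with tool-augmented agents, here is a rough sketch of what a memory tool and a tool dispatcher can look like. None of this comes from the report: `PatientMemory`, the tool names `memory_write` and `memory_read`, and the `dispatch` helper are hypothetical, and the retrieval and perception tools are left out because the post gives no detail about their interfaces.

```python
from typing import Any, Callable, Dict


class PatientMemory:
    """Hypothetical long-term store: one dict of facts per patient."""

    def __init__(self) -> None:
        self._facts: Dict[str, Dict[str, str]] = {}

    def write(self, patient_id: str, key: str, value: str) -> str:
        # Later writes overwrite earlier ones, so the memory stays current.
        self._facts.setdefault(patient_id, {})[key] = value
        return "ok"

    def read(self, patient_id: str) -> Dict[str, str]:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._facts.get(patient_id, {}))


def make_tools(memory: PatientMemory) -> Dict[str, Callable[..., Any]]:
    """Registry mapping tool names (as the model would emit them) to functions."""
    return {
        "memory_write": memory.write,
        "memory_read": memory.read,
    }


def dispatch(tools: Dict[str, Callable[..., Any]], name: str, **kwargs: Any) -> Any:
    """Run one tool call requested by the model; unknown names are rejected."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](**kwargs)
```

In a setup like this, a follow-up consultation would start with a `memory_read` call so that facts recorded in earlier visits are available before the model asks new questions.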