
This is one of the more important results in AI for biology lately.
The finding that performance saturates at ~200k cells (and that simple baselines can match or beat large transformers) suggests *single-cell data* has fundamentally different scaling properties than language or images. The data is sparser, noisier, and more task-specific.
What stands out is that objective alignment and data quality and curation mattered far more than volume. In drug discovery and perturbation modelling, this feels like a signal to stop treating cell atlases as generic pretraining fuel and instead focus on building objectives that directly reflect the downstream biological questions.
The “bitter lesson” has limits. In some domains, better inductive biases and task framing beat brute-force scale.

