I'm excited to share our new preprint "Vermeer: Autoregressive generative modeling of microscopy predicts protein localization," a collaboration between @MSFTResearch and @insitubiology!
Preprint: https://www.biorxiv.org/content/10.64898/2026.06.01.729395v1
Protein localization is fundamental to protein function, but experimental imaging, a key technique for studying localization, cannot scale to the entire human proteome across all biological contexts.
To address this challenge, we introduce Vermeer, an autoregressive generative model trained on Human Protein Atlas data @ProteinAtlas. Vermeer synthesizes fluorescence microscopy images of proteins conditioned on cell morphology reference stains and protein sequence embeddings (ESM-C).
Vermeer can leverage latent information about protein localization in the ESM-C embeddings to generalize to proteins it has never seen during training. (1/n)
