
Microsoft Research Unveils Vermeer Model To Predict Protein Localization

Sandeep Kambhampati@SandeepKambham2

I'm excited to share our new preprint "Vermeer: Autoregressive generative modeling of microscopy predicts protein localization," a collaboration between @MSFTResearch and @insitubiology!

Preprint: https://www.biorxiv.org/content/10.64898/2026.06.01.729395v1

Protein localization is fundamental to protein function, but experimental imaging, a key technique to study localization, cannot scale to the entire human proteome across all biological contexts.

To address this challenge, we introduce Vermeer, an autoregressive generative model trained on Human Protein Atlas data @ProteinAtlas. Vermeer synthesizes fluorescence microscopy images of proteins conditioned on cell morphology reference stains and protein sequence embeddings (ESM-C).

Vermeer can leverage latent information about protein localization in the ESM embeddings to generalize to proteins the model has never seen during training. (1/n)

9:35 AM · Jun 8, 2026 · 11.1K Views
Sentiment

Users are excited about Vermeer because it offers a scalable, flexible approach to generative modeling of fluorescence microscopy data for protein localization, and they thank its contributors.

Positive: 100.0% · Negative: 0.0% (2 comments with sentiment)
Cluster Engagement
Posts from X
Sandeep Kambhampati@SandeepKambham2

We evaluate our model by comparing generated images to true held-out data. We test Vermeer across different axes of generalization, including unseen protein/cell-line combinations, unseen cell lines, and unseen proteins (shown here). (2/n)
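The three generalization axes named in this post can be made concrete with a small sketch. It assumes each example is identified only by a protein and a cell line; the `Sample` class, the `classify_held_out` function, and the placeholder identifiers are hypothetical and are not taken from the preprint or its unreleased code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One imaged example, identified by protein and cell line (hypothetical schema)."""
    protein: str
    cell_line: str


def classify_held_out(sample, train):
    """Label a held-out sample by which generalization axis it tests.

    The checks run from hardest to easiest: a protein never seen in training,
    then a cell line never seen, then a pairing never seen together.
    """
    seen_proteins = {s.protein for s in train}
    seen_cell_lines = {s.cell_line for s in train}
    seen_pairs = {(s.protein, s.cell_line) for s in train}

    if sample.protein not in seen_proteins:
        return "unseen_protein"
    if sample.cell_line not in seen_cell_lines:
        return "unseen_cell_line"
    if (sample.protein, sample.cell_line) not in seen_pairs:
        return "unseen_combination"
    return "seen"


# Placeholder identifiers, chosen only for illustration.
train = [Sample("protein_a", "line_1"), Sample("protein_b", "line_2")]
print(classify_held_out(Sample("protein_a", "line_2"), train))
```

Here `protein_a` and `line_2` both appear in training but never together, so the sample falls on the unseen-combination axis.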

13h · 80 Views · 1 Like
Sandeep Kambhampati@SandeepKambham2

Huge thanks to all of my co-authors, Eric Zimmermann, Emre Hayir, and @KevinKaichuang, for all their contributions! Especially grateful to Fei and @alexijielu for being great mentors. (5/n)

13h · 63 Views · 3 Likes
Sandeep Kambhampati@SandeepKambham2

To explain this generalization, we show that Vermeer attends to known localization subsequences, such as the nuclear localization signal (NLS), which mediates protein translocation into the nucleus. (3/n)
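One way to picture this kind of check is to measure how much per-residue attention falls inside a candidate NLS. The sketch below is an illustration only, not the analysis in the preprint: it assumes a simplified classical monopartite consensus, K-(K/R)-X-(K/R), and a hypothetical list of per-residue attention scores. Both function names are invented for this example.

```python
import re

# Simplified classical monopartite NLS consensus K-(K/R)-X-(K/R).
# This pattern is an assumption for illustration; real NLS annotation is more involved.
NLS_PATTERN = re.compile(r"K[KR].[KR]")


def motif_positions(sequence):
    """Return the set of residue indices covered by non-overlapping motif matches."""
    positions = set()
    for match in NLS_PATTERN.finditer(sequence):
        positions.update(range(match.start(), match.end()))
    return positions


def attention_fraction_on_motif(sequence, attention):
    """Fraction of total attention mass that lands on motif residues.

    `attention` is a hypothetical list of non-negative per-residue scores,
    one per amino acid in `sequence`.
    """
    if len(attention) != len(sequence):
        raise ValueError("attention must have one score per residue")
    total = sum(attention)
    if total == 0:
        return 0.0
    covered = motif_positions(sequence)
    return sum(attention[i] for i in covered) / total


# The SV40 large T antigen NLS, a textbook example of a monopartite signal.
sequence = "PKKKRKV"
print(sorted(motif_positions(sequence)))
```

A fraction close to 1 would mean attention is concentrated on the motif, while uniform attention would give roughly the motif's share of the sequence length.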

13h · 77 Views · 1 Like
Sandeep Kambhampati@SandeepKambham2

We see Vermeer as a highly scalable and flexible approach for generative modeling of fluorescence microscopy data. We're excited to continue developing and applying Vermeer to uncover new biology. Check out the preprint for more details; our code will be released on GitHub soon! (4/n)

13h · 57 Views · 1 Like
Sandeep Kambhampati@SandeepKambham2

@broadinstitute @harvardmed @Harvard #AI4Science #MachineLearning #BioImaging

13h · 59 Views · 1 Like