We have a new paper published!
We asked a simple question about #LongCovid "phenotypes": if you take the same patients and the same symptoms, but run different clustering algorithms, do you get the same patient subgroups?
Short answer: no. 🧵 1/

Thank you to @the_tessallator for leading the paper and its analysis, @rusty_cjm and @leothesaffer for analysis, and @ahandvanish, @leticiasaurus, @ChronicResearch, Alison Cohen, @xuanalogue, and @tessfalor for your work on this paper! 16/
https://academic.oup.com/ooim/article/7/1/iqag010/8707854
Really proud of myself for getting this one over the finish line. It was a super interesting chance for me to examine cluster robustness in a new-to-me domain, and great to be part of OOI's special issue on patient-led work.

Our asks for future phenotyping work:
✅ report sensitivity to algorithm choice & subsampling, not just internal scores
✅ capture the full breadth of symptoms, including severity and trajectory
✅ integrate biomarkers to define real endotypes, not just symptom boundaries
14/

Takeaways:
1) Symptoms alone are likely not the best way to define #LongCovid phenotypes. We need more phenotyping by biomarkers!
2) Studies using fewer symptoms, fewer patients, or a single clustering method are likely detecting phenotypes that aren't robust or repeatable. 7/

With any clustering, caution is needed to ensure that algorithmically imposed boundaries are not mistakenly interpreted as biologically discrete or clinically stable subtypes. #LongCovid 13/

This paper was led entirely by people with #LongCovid with machine learning backgrounds, with help from people with other IACCs or those taking care of people with Long COVID. 15/

That said, some patterns did recur across all 3 methods:
🔹 A high-burden, multi-systemic cluster appeared every time
🔹 Higher symptom burden tracked with more severe physical & cognitive PEM
🔹 Symptom burden tracked with demographics (see next tweet)
#LongCovid 8/

The data: 6,031 adults with #LongCovid from our patient-led international survey.
Each person reported presence/absence of 162 symptoms across 10 organ systems, plus post-exertional malaise (PEM) severity and demographics. 2/

We ran 3 different unsupervised ML methods on the exact same symptom matrix:
A) autoencoder + HDBSCAN
B) ensemble UMAP + k-means consensus
C) latent class analysis (LCA)
Then we asked how much they actually agreed. #LongCovid 3/

Even when two methods both found a "high symptom burden" cluster, they didn't contain the same patients.
A patient could land in a high-burden neurocognitive cluster under one algorithm and a generic multi-system cluster under another. 5/

Each method produced clinically plausible clusters — high-burden neurocognitive groups, autonomic groups, pain-dominant groups. They all looked reasonable on their own.
But agreement between methods was low. Pairwise scores ranged from just 0.13 to 0.40. 4/
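
For readers who want to run this kind of check on their own cluster labels, here is a minimal sketch of a pairwise agreement score. It assumes the adjusted Rand index (ARI), which is one common choice; this thread does not say which metric produced the 0.13 to 0.40 range, so treat the metric as an assumption. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same patients.

    1.0 means identical partitions (cluster names do not matter),
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same patients")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two patients: nothing to disagree about

    # Pairs of patients placed together under both labelings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together under each labeling separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both put everyone in one cluster
    return (together_both - expected) / (maximum - expected)


def pairwise_agreement(assignments):
    """ARI for every pair of methods in a {method_name: labels} dict."""
    return {
        (m1, m2): adjusted_rand_index(assignments[m1], assignments[m2])
        for m1, m2 in combinations(sorted(assignments), 2)
    }


# Toy labels for six imaginary patients (illustrative only, not study data).
toy = {
    "A": [0, 0, 0, 1, 1, 1],
    "B": [0, 0, 1, 1, 2, 2],
    "C": [1, 1, 1, 0, 0, 0],
}
scores = pairwise_agreement(toy)
```

In the toy example, methods A and C score 1.0 because they split the patients identically and only the cluster names differ, while B agrees only partly with either. That is why agreement has to be measured on which patients are grouped together rather than on cluster names or descriptions.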

Consistently across methods:
🔹 Low-burden clusters → higher average age, lower proportion of women
🔹 High-burden clusters → younger, more women, and more severe physical & cognitive PEM
Women were over-represented in high-burden groups throughout. 9/

"Cleaner" clusters, as seen in other phenotyping papers, may just be an artifact of measuring less.
When we dropped symptoms, the optimal cluster count fell. 6/

Two subgroups were reproduced in 2 out of the 3 methods: one with prominent speech & cognitive-linguistic difficulty (B and C), and one reporting PEM but minimal sleep disturbance (A and C). 10/

Our results support using symptom clusters as exploratory tools to generate hypotheses and help communicate about patterns of #LongCovid illness, rather than as rigid labels that define eligibility or predict response to specific treatments. 12/

Bottom line for clinicians & researchers: symptom clusters are useful exploratory and communication tools, not fixed diagnostic types.
Don't treat single-method clusters as biologically discrete subtypes or use them to define trial eligibility without checking robustness. 11/