Lo Conte, S., Bufano, A., Cevenini, G., Barbini, P., Castagna, M.G., Cartocci, A. (2026). Unsupervised identification of muscle phenotypes in adults with obesity: a data-driven framework for the identification of sarcopenia in absence of a gold standard. International Journal of Medical Informatics, 213. doi:10.1016/j.ijmedinf.2026.106398.
Unsupervised identification of muscle phenotypes in adults with obesity: a data-driven framework for the identification of sarcopenia in absence of a gold standard
Annalisa Bufano; Gabriele Cevenini; Paolo Barbini; Maria Grazia Castagna; Alessandra Cartocci
2026-01-01
Abstract
Background: Sarcopenia is characterized by progressive loss of skeletal muscle mass and strength and is associated with increased disability and mortality. However, the diagnosis of sarcopenia remains challenging due to the absence of a universally accepted gold standard and validated cut-off values for skeletal muscle indices. Data-driven approaches based on unsupervised clustering may overcome these limitations by identifying muscle-related phenotypes directly from anthropometric and body composition data.

Methods: In this study, 600 adults with obesity were analyzed and stratified by sex. The dataset was randomly divided into a training set (80%) and a testing set (20%). After data standardization, principal component analysis (PCA) was applied separately in males and females. Unsupervised clustering was then performed on the preserved principal components, and the optimal number of clusters was determined using internal validation indices. Linear Discriminant Analysis (LDA) was applied to assign patients in the test set, and posterior probabilities were correlated with Skeletal Muscle Index (SMI).

Results: Clustering consistently identified two distinct groups in both sexes: one with higher SMI and another with lower SMI, consistent with reduced muscle status. Stepwise LDA accurately classified individuals, and posterior probabilities of belonging to the pathological cluster were negatively correlated with SMI in both sexes, despite SMI not being used in clustering or classification. Individuals in the pathological group exhibited significantly lower SMI, particularly among females.

Conclusions: The combined use of unsupervised clustering and LDA allows reliable identification of distinct muscle-related phenotypes in adults with obesity. This framework provides reproducible classifications, correlates with skeletal muscle index, and offers a quantitative approach to stratify patients by muscle status, even in the absence of predefined diagnostic criteria. These findings support the potential of data-driven phenotyping to improve early detection of sarcopenic obesity.
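The pipeline described in the Methods (standardization, PCA, clustering on the retained components, LDA posteriors on a held-out set, correlation with SMI) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not name the clustering algorithm, the component-retention rule, or the stepwise selection procedure, so the sketch assumes k-means, a 90% cumulative-variance cut-off, a fixed number of two clusters, and plain (non-stepwise) two-class LDA. The data are synthetic and every variable name is hypothetical. SMI is used only afterwards, to decide which cluster to call "low-muscle" and to compute the correlation.

```python
import numpy as np

def kmeans(X, k=2, n_iter=100, seed=0):
    """Plain Lloyd's k-means (assumed here; the abstract names no algorithm)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the previous center if a cluster happens to become empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def lda_fit(X, y):
    """Two-class LDA with a pooled covariance matrix; y holds 0/1 labels."""
    means = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
    priors = np.array([(y == c).mean() for c in (0, 1)])
    resid = X - means[y]
    cov_inv = np.linalg.pinv(resid.T @ resid / (len(X) - 2))
    return means, priors, cov_inv

def lda_posterior(X, means, priors, cov_inv):
    """Posterior class probabilities from the linear discriminant scores."""
    scores = (X @ cov_inv @ means.T
              - 0.5 * np.sum(means @ cov_inv * means, axis=1)
              + np.log(priors))
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

def run_sketch(seed=0, n=300):
    rng = np.random.default_rng(seed)
    # Hypothetical synthetic data for one sex: six body-composition-like
    # features and an SMI value, both driven by a hidden two-level group.
    group = rng.integers(0, 2, size=n)
    shift = np.array([2.0, 1.5, -1.5, 2.0, 1.0, -2.0])
    X = rng.normal(size=(n, 6)) + group[:, None] * shift
    smi = 9.0 - 1.5 * group + rng.normal(scale=0.8, size=n)

    # Random 80/20 split into training and testing sets.
    idx = rng.permutation(n)
    n_train = (4 * n) // 5
    tr, te = idx[:n_train], idx[n_train:]

    # Standardize with training-set statistics only.
    mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0, ddof=1)
    Z_tr, Z_te = (X[tr] - mu) / sd, (X[te] - mu) / sd

    # PCA via SVD; keep components up to 90% cumulative variance (assumption).
    _, s, vt = np.linalg.svd(Z_tr, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_comp = int(np.searchsorted(explained, 0.90)) + 1
    W = vt[:n_comp].T
    P_tr, P_te = Z_tr @ W, Z_te @ W

    # Cluster the training scores; SMI is not an input to the clustering.
    labels = kmeans(P_tr, k=2, seed=seed)
    # Post hoc only: call the cluster with the lower mean SMI "low-muscle".
    low = int(np.argmin([smi[tr][labels == j].mean() for j in (0, 1)]))
    y = (labels == low).astype(int)  # class 1 = low-muscle cluster

    # LDA trained on cluster labels, then applied to the test set.
    means, priors, cov_inv = lda_fit(P_tr, y)
    posterior = lda_posterior(P_te, means, priors, cov_inv)

    # Pearson correlation between P(low-muscle cluster) and test-set SMI.
    r = np.corrcoef(posterior[:, 1], smi[te])[0, 1]
    return r, posterior

r, posterior = run_sketch()
print(f"Correlation between low-muscle posterior and SMI: {r:.2f}")
```

In this sketch the standardization parameters and the PCA loadings are estimated on the training set only and then reused on the test set, so the held-out posteriors are not influenced by test data. In a sex-stratified analysis such as the one described, the same steps would be run once for males and once for females.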
https://hdl.handle.net/11365/1313116
