Rotelli, A., Iadanza, E. (2025). Advancing Colorectal Cancer Diagnosis: Integrating Synthetic Data and Machine Learning for Microbiome Analysis. In Generative Artificial Intelligence for Biomedical and Smart Health Informatics, edited by D.G. A. Khamparia (pp. 135-152). Hoboken: IEEE Press Wiley [10.1002/9781394280735.ch8].
Advancing Colorectal Cancer Diagnosis: Integrating Synthetic Data and Machine Learning for Microbiome Analysis
Ernesto Iadanza
2025-01-01
Abstract
This work highlights the importance of the gut microbiota in colorectal health, particularly during the AP-to-CRC transition. Synthetic data augmentation was used to enlarge and balance a multidimensional OTU table for machine learning classification. The OTU table was refined using SVM and LG for sample validation, together with several statistical tests to ensure that the synthetic data were realistic. In addition, deep learning feature extraction with LRP identified 64 unique bacterial taxa, which were assessed for their ability to distinguish AP and CRC samples across diverse datasets. Fusobacterium ranked as important in both the LRP and SHAP analyses, consistent with its known association with CRC. SHAP analysis using XGBoost identified discriminative features such as Parvimonas, Alistipes, and Ruminococcus in the stool and biopsy datasets. Notably, the saliva dataset was classified with 100% accuracy. © 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
File | Size | Format | |
---|---|---|---|
ADVANCING_COLORECTAL_CANCER_DIAGNOSIS__INTEGRATING_SYNTHETIC_DATA_AND_MACHINE_LEARNING_FOR_MICROBIOME_ANALYSIS.pdf (not available). Description: Chapter 8. Type: Post-print. License: NOT PUBLIC - Private/restricted access | 1.36 MB | Adobe PDF | View/Open, Request a copy |
Advancing Colorectal Cancer Diagnosis-Rotelli-2025.pdf (not available). Description: Chapter 8. Type: Publisher's PDF. License: NOT PUBLIC - Private/restricted access | 550.19 kB | Adobe PDF | View/Open, Request a copy |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1285554