Advancing Colorectal Cancer Diagnosis: Integrating Synthetic Data and Machine Learning for Microbiome Analysis

Ernesto Iadanza
2025-01-01

Abstract

This work highlights the importance of the gut microbiota in colorectal health, especially during the transition from adenomatous polyps (AP) to colorectal cancer (CRC). Synthetic data augmentation was used to enlarge and balance a multidimensional operational taxonomic unit (OTU) table for machine learning classification. The OTU table was refined using SVM and LG for sample validation, together with several statistical tests to check that the synthetic data were realistic. In addition, deep learning feature extraction with layer-wise relevance propagation (LRP) identified 64 unique bacterial taxa, which were assessed for their ability to distinguish AP from CRC samples across diverse datasets. Fusobacterium ranked as an important feature in both the LRP and SHAP analyses, consistent with its known association with CRC. SHAP analysis with XGBoost identified discriminative features such as Parvimonas, Alistipes, and Ruminococcus in the stool and biopsy datasets. Notably, the saliva dataset was classified with 100% accuracy.

© 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Year: 2025
ISBN: 9781394280704
Rotelli, A., Iadanza, E. (2025). Advancing Colorectal Cancer Diagnosis: Integrating Synthetic Data and Machine Learning for Microbiome Analysis. In Generative Artificial Intelligence for Biomedical and Smart Health Informatics, edited by D.G. A. Khamparia (pp. 135-152). Hoboken: IEEE Press Wiley [10.1002/9781394280735.ch8].
Files in this record:

File: ADVANCING_COLORECTAL_CANCER_DIAGNOSIS__INTEGRATING_SYNTHETIC_DATA_AND_MACHINE_LEARNING_FOR_MICROBIOME_ANALYSIS.pdf (not available)
Description: Chapter 8
Type: Post-print
License: NOT PUBLIC - Private/restricted access
Size: 1.36 MB
Format: Adobe PDF

File: Advancing Colorectal Cancer Diagnosis-Rotelli-2025.pdf (not available)
Description: Chapter 8
Type: Publisher's PDF
License: NOT PUBLIC - Private/restricted access
Size: 550.19 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1285554