We employed a multifaceted computational strategy to identify the genetic factors contributing to increased risk of severe COVID-19 infection from a Whole Exome Sequencing (WES) dataset of a cohort of 2000 Italian patients. We coupled a stratified k-fold screening, to rank the variants most associated with severity, with the training of multiple supervised classifiers, to predict severity based on the screened features. Feature importance analysis from tree-based models allowed us to identify the 16 variants with the highest support which, together with age and gender covariates, were found to be most predictive of COVID-19 severity. When tested on a follow-up cohort, our ensemble of models predicted severity with high accuracy (ACC = 81.88%; AUCROC = 96%; MCC = 61.55%). Our model recapitulated a vast literature of emerging molecular mechanisms and genetic factors linked to COVID-19 response and extended previous landmark Genome-Wide Association Studies (GWAS). It revealed a network of interplaying genetic signatures converging on established immune system and inflammatory processes linked to viral infection response. It also identified additional processes cross-talking with immune pathways, such as GPCR signaling, which might offer additional opportunities for therapeutic intervention and patient stratification. Publicly available PheWAS datasets revealed that several variants were significantly associated with phenotypic traits such as "Respiratory or thoracic disease", supporting their link with COVID-19 severity outcome.

A multifaceted computational strategy identifies 16 genetic variants contributing to increased risk of severe COVID-19 infection from a Whole Exome Sequencing dataset of a cohort of Italian patients.
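The screening step described above (rank variants within stratified k-fold splits, then keep those with the highest support across splits) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the association statistic, the genotype encoding, the number of folds, or the cut-off, so the carrier-frequency difference, the binary carrier encoding, `k=5`, `top_n=2`, all function names, and the synthetic data below are assumptions made purely for illustration.

```python
import random
from collections import Counter

def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin across folds
    return folds

def association_score(genotypes, labels, variant, rows):
    """Assumed statistic: absolute difference in carrier frequency
    between severe (label 1) and non-severe (label 0) samples."""
    carriers = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for i in rows:
        totals[labels[i]] += 1
        carriers[labels[i]] += genotypes[i][variant]
    return abs(carriers[1] / totals[1] - carriers[0] / totals[0])

def screen_variants(genotypes, labels, k=5, top_n=2, seed=0):
    """Count in how many training splits each variant ranks among the top_n."""
    n_variants = len(genotypes[0])
    folds = stratified_folds(labels, k, seed)
    support = Counter()
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        scores = [association_score(genotypes, labels, v, train)
                  for v in range(n_variants)]
        ranked = sorted(range(n_variants), key=lambda v: scores[v], reverse=True)
        support.update(ranked[:top_n])
    return support

# Synthetic toy data (hypothetical): 40 samples, 4 binary "variants".
# Variant 0 matches the label exactly, variant 3 matches it except for two
# samples, variant 1 is balanced within each class, variant 2 is never carried.
labels = [i % 2 for i in range(40)]
genotypes = [[y, (i // 2) % 2, 0, y if i > 1 else 1 - y]
             for i, y in enumerate(labels)]

support = screen_variants(genotypes, labels, k=5, top_n=2)
print(support.most_common())
```

In this toy setting the two label-linked variants rank in the top two of every training split, which is the sense in which a variant has "high support". In practice the variants retained by such a screen would then be passed to the supervised classifiers, and evaluation would be done on data not used for screening.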
Onoja, A., Picchiotti, N., Fallerini, C., Baldassarri, M., Fava, F., Mari, F., et al. (2022). An explainable model of host genetic interactions linked to COVID-19 severity. Communications Biology, 5(1), 1133 [10.1038/s42003-022-04073-6].
An explainable model of host genetic interactions linked to COVID-19 severity
Chiara Fallerini (Writing – Review & Editing);
Margherita Baldassarri;
Francesca Fava (Resources);
Francesca Mari (Member of the Collaboration Group);
Sergio Daga (Member of the Collaboration Group);
Elisa Benetti (Member of the Collaboration Group);
Mirella Bruttini (Member of the Collaboration Group);
Maria Palmieri (Member of the Collaboration Group);
Susanna Croci (Member of the Collaboration Group);
Sara Amitrano (Member of the Collaboration Group);
Ilaria Meloni (Member of the Collaboration Group);
Elisa Frullanti (Member of the Collaboration Group);
Gabriella Doddato (Member of the Collaboration Group);
Mirjam Lista (Member of the Collaboration Group);
Giada Beligni (Member of the Collaboration Group);
Floriana Valentino (Member of the Collaboration Group);
Kristina Zguro (Member of the Collaboration Group);
Rossella Tita (Member of the Collaboration Group);
Annarita Giliberti (Member of the Collaboration Group);
Maria Antonietta Mencarelli (Member of the Collaboration Group);
Caterina Lo Rizzo (Member of the Collaboration Group);
Anna Maria Pinto (Member of the Collaboration Group);
Francesca Ariani (Member of the Collaboration Group);
Laura Di Sarno (Member of the Collaboration Group);
Francesca Montagnani (Member of the Collaboration Group);
Mario Tumbarello (Member of the Collaboration Group);
Massimiliano Fabbiani (Member of the Collaboration Group);
Barbara Rossetti (Member of the Collaboration Group);
Laura Bergantini (Member of the Collaboration Group);
Miriana D'Alessandro (Member of the Collaboration Group);
Paolo Cameli (Member of the Collaboration Group);
David Bennett (Member of the Collaboration Group);
Federico Anedda (Member of the Collaboration Group);
Simona Marcantonio (Member of the Collaboration Group);
Sabino Scolletta (Member of the Collaboration Group);
Federico Franchi (Member of the Collaboration Group);
Maria Antonietta Mazzei (Member of the Collaboration Group);
Edoardo Conticini (Member of the Collaboration Group);
Luca Cantarini (Member of the Collaboration Group);
Bruno Frediani (Member of the Collaboration Group);
Danilo Tacconi (Member of the Collaboration Group);
Chiara Spertilli Raffaelli (Member of the Collaboration Group);
Marco Feri (Member of the Collaboration Group);
Alice Donati (Member of the Collaboration Group);
Raffaele Scala (Member of the Collaboration Group);
Luca Guidelli (Member of the Collaboration Group);
Genni Spargi (Member of the Collaboration Group);
Leonardo Croci (Member of the Collaboration Group);
Silvia Cappelli (Member of the Collaboration Group);
Agnese Verzuri (Member of the Collaboration Group);
Agostino Ognibene (Member of the Collaboration Group);
Alessandra Vergori (Member of the Collaboration Group);
Arianna Emiliozzi (Member of the Collaboration Group);
Andrea Tommasi (Member of the Collaboration Group);
Lucia Vietri (Member of the Collaboration Group);
Francesca Gatti (Member of the Collaboration Group);
Serafina Valente (Member of the Collaboration Group);
Oreste De Vivo (Member of the Collaboration Group);
Elena Bargagli (Member of the Collaboration Group);
Alessia Giorli (Member of the Collaboration Group);
Lorenzo Salerni (Member of the Collaboration Group);
Enrico Martinelli (Member of the Collaboration Group);
Katia Capitani (Member of the Collaboration Group);
Simona Dei (Member of the Collaboration Group);
Rosangela Artuso (Member of the Collaboration Group);
Elena Andreucci (Member of the Collaboration Group);
Angelica Pagliazzi (Member of the Collaboration Group);
Riccardo Colombo (Member of the Collaboration Group);
Sauro Luchi (Member of the Collaboration Group);
Paola Petrocelli (Member of the Collaboration Group);
Sara Modica (Member of the Collaboration Group);
Silvia Baroni (Member of the Collaboration Group);
Marco Falcone (Member of the Collaboration Group);
Claudio Ferri (Member of the Collaboration Group);
Francesco Brancati (Member of the Collaboration Group);
Valentina Borgo (Member of the Collaboration Group);
Gabriella Maria Squeo (Member of the Collaboration Group);
Alessandra Renieri (Writing – Review & Editing);
Simone Furini (Formal Analysis)
2022-01-01
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/1223542
Warning: the data displayed have not been validated by the university.