Leveraging molecular graphs for natural product classification

Prete, Alessia Lucia; Corradini, Barbara Toniella; Costanti, Filippo; Scarselli, Franco; Bianchini, Monica
2025-01-01

Abstract

Natural products (NPs) are a rich source of bioactive compounds with high structural diversity and therapeutic potential. Automatic classification of NPs is critical to ensure safety, support regulatory compliance, inform product usage, and enable the discovery of new pharmacologically relevant molecules. However, traditional rule-based approaches and hand-crafted molecular fingerprints often fail to capture the structural and biosynthetic complexity of NPs, which makes their classification difficult. In this work, we explore the use of graph neural networks (GNNs) for NP classification by learning neural fingerprints directly from molecular graph structures. GNNs are well suited to this task, as they can model both the topology and the local chemical environments of molecules. We evaluate multiple GNN architectures on a curated NP dataset and assess their ability to generalize across hierarchical classification targets. To promote reproducibility, we also examine implementation aspects such as graph construction, node feature selection, architecture tuning, and training strategies. Our results show that GNN-based models consistently outperform classifiers based on traditional fingerprints, achieving superior accuracy and robustness. Model performance is strongly influenced by the choice of architecture and feature representation, which underlines the importance of task-specific model design. These findings highlight the potential of GNNs as effective tools for NP classification. By leveraging graph-based representations, GNNs offer a scalable, data-driven approach that better reflects the structural and functional complexity of natural products. This work provides methodological guidance and encourages broader adoption of topology-aware deep learning algorithms in natural product research and drug discovery. Our code is available at the project page: https://bcorrad.github.io/ginestra25/
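
To make the idea of a fingerprint learned from a molecular graph more concrete, the following is a minimal, parameter-free sketch in plain Python. It is not the architecture or featurization used in the paper: the three-element vocabulary, the ethanol example, and all function names are assumptions chosen for illustration. It builds a heavy-atom graph, assigns one-hot node features, runs one round of neighbor aggregation, and sums the node vectors into a graph-level vector.

```python
# Illustrative sketch only: not the authors' model or feature set.

# Heavy-atom graph of ethanol (CH3-CH2-OH): atoms 0 and 1 are carbon, atom 2 is oxygen.
ATOMS = ["C", "C", "O"]
BONDS = [(0, 1), (1, 2)]      # undirected bonds as index pairs
ELEMENTS = ["C", "N", "O"]    # hypothetical, tiny element vocabulary


def one_hot(symbol):
    """Encode an element symbol as a one-hot node feature vector."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def neighbors(n_atoms, bonds):
    """Build an adjacency list from an undirected bond list."""
    adj = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def message_passing_step(features, adj):
    """One round without learned weights: each atom adds its neighbors' features to its own."""
    updated = []
    for i, own in enumerate(features):
        agg = list(own)
        for j in adj[i]:
            agg = [a + b for a, b in zip(agg, features[j])]
        updated.append(agg)
    return updated


def readout(features):
    """Sum the node vectors into a single graph-level vector (the 'fingerprint')."""
    return [sum(col) for col in zip(*features)]


node_features = [one_hot(a) for a in ATOMS]
adjacency = neighbors(len(ATOMS), BONDS)
hidden = message_passing_step(node_features, adjacency)
fingerprint = readout(hidden)
print(fingerprint)  # [5.0, 0.0, 2.0]
```

In a trained GNN, the plain sum in `message_passing_step` would typically be followed by learned weight matrices and a nonlinearity, several rounds would be stacked, and the readout vector would feed a classifier; the sketch omits all of that to show only how topology and local chemical environment enter the representation.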
Prete, A.L., Corradini, B.T., Costanti, F., Scarselli, F., Bianchini, M. (2025). Leveraging molecular graphs for natural product classification. Computational and Structural Biotechnology Journal, 27, 3837-3884 [10.1016/j.csbj.2025.08.031].
Files in this item:
File: 1-s2.0-S2001037025003538-main.pdf
Access: open access
Type: Publisher's PDF
License: Creative Commons
Size: 5.13 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1299455