The classification of graphical patterns (i.e., data that are represented in the form of labeled graphs) is a problem that has been receiving considerable attention by the machine learning community in recent years. Solutions to the problem would be valuable to a number of applications, ranging from bioinformatics and cheminformatics to Web-related tasks, structural pattern recognition for image processing, etc. Several approaches have been proposed so far, e.g. inductive logic programming and kernels for graphs. Connectionist models were introduced too, namely recursive neural nets (RNN) and graph neural nets (GNN). Although their theoretical properties are sound and thoroughly understood, RNNs and GNNs suffer some drawbacks that may limit their application. This paper introduces an alternative connectionist framework for learning discriminant functions over graphical data. The approach is simple and suitable to maximum-a-posteriori classification of broad families of graphs, and overcomes some limitations of RNNs and GNNs. The idea is to describe a graph as an algebraic relation, i.e. as a subset of the Cartesian product. The class-posterior probabilities given the relation are then reduced (under an iid assumption) to products of probabilistic quantities, estimated using a multilayer perceptron. Empirical evidence shows that, in spite of its simplicity, the technique compares favorably with established approaches on several tasks involving different graphical representations of the data. In particular, in the classification of molecules from the Mutagenesis dataset (friendly+unfriendly) the best result to date (93.91%) is obtained.
Trentin, E., ERNESTO DI, I. (2009). Classification of graphical data made easy. NEUROCOMPUTING, 73(1-3), 204-212 [10.1016/j.neucom.2008.07.021].
Classification of graphical data made easy
TRENTIN, EDMONDO;
2009-01-01
Abstract
The classification of graphical patterns (i.e., data that are represented in the form of labeled graphs) is a problem that has been receiving considerable attention by the machine learning community in recent years. Solutions to the problem would be valuable to a number of applications, ranging from bioinformatics and cheminformatics to Web-related tasks, structural pattern recognition for image processing, etc. Several approaches have been proposed so far, e.g. inductive logic programming and kernels for graphs. Connectionist models were introduced too, namely recursive neural nets (RNN) and graph neural nets (GNN). Although their theoretical properties are sound and thoroughly understood, RNNs and GNNs suffer some drawbacks that may limit their application. This paper introduces an alternative connectionist framework for learning discriminant functions over graphical data. The approach is simple and suitable to maximum-a-posteriori classification of broad families of graphs, and overcomes some limitations of RNNs and GNNs. The idea is to describe a graph as an algebraic relation, i.e. as a subset of the Cartesian product. The class-posterior probabilities given the relation are then reduced (under an iid assumption) to products of probabilistic quantities, estimated using a multilayer perceptron. Empirical evidence shows that, in spite of its simplicity, the technique compares favorably with established approaches on several tasks involving different graphical representations of the data. In particular, in the classification of molecules from the Mutagenesis dataset (friendly+unfriendly) the best result to date (93.91%) is obtained.File | Dimensione | Formato | |
---|---|---|---|
TrentinDiIorio_Neucom2009.pdf
non disponibili
Tipologia:
Post-print
Licenza:
PUBBLICO - Pubblico con Copyright
Dimensione
218.13 kB
Formato
Adobe PDF
|
218.13 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11365/24247
Attenzione
Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo