The classification of graphical patterns (i.e., data that are represented in the form of labeled graphs) is a problem that has been receiving considerable attention by the machine learning community in recent years. Solutions to the problem would be valuable to a number of applications, ranging from bioinformatics and cheminformatics to Web-related tasks, structural pattern recognition for image processing, etc. Several approaches have been proposed so far, e.g. inductive logic programming and kernels for graphs. Connectionist models were introduced too, namely recursive neural nets (RNN) and graph neural nets (GNN). Although their theoretical properties are sound and thoroughly understood, RNNs and GNNs suffer some drawbacks that may limit their application. This paper introduces an alternative connectionist framework for learning discriminant functions over graphical data. The approach is simple and suitable to maximum-a-posteriori classification of broad families of graphs, and overcomes some limitations of RNNs and GNNs. The idea is to describe a graph as an algebraic relation, i.e. as a subset of the Cartesian product. The class-posterior probabilities given the relation are then reduced (under an iid assumption) to products of probabilistic quantities, estimated using a multilayer perceptron. Empirical evidence shows that, in spite of its simplicity, the technique compares favorably with established approaches on several tasks involving different graphical representations of the data. In particular, in the classification of molecules from the Mutagenesis dataset (friendly+unfriendly) the best result to date (93.91%) is obtained.

Trentin, E., ERNESTO DI, I. (2009). Classification of graphical data made easy. NEUROCOMPUTING, 73(1-3), 204-212 [10.1016/j.neucom.2008.07.021].

Classification of graphical data made easy

TRENTIN, EDMONDO;
2009-01-01

Abstract

The classification of graphical patterns (i.e., data that are represented in the form of labeled graphs) is a problem that has been receiving considerable attention by the machine learning community in recent years. Solutions to the problem would be valuable to a number of applications, ranging from bioinformatics and cheminformatics to Web-related tasks, structural pattern recognition for image processing, etc. Several approaches have been proposed so far, e.g. inductive logic programming and kernels for graphs. Connectionist models were introduced too, namely recursive neural nets (RNN) and graph neural nets (GNN). Although their theoretical properties are sound and thoroughly understood, RNNs and GNNs suffer some drawbacks that may limit their application. This paper introduces an alternative connectionist framework for learning discriminant functions over graphical data. The approach is simple and suitable to maximum-a-posteriori classification of broad families of graphs, and overcomes some limitations of RNNs and GNNs. The idea is to describe a graph as an algebraic relation, i.e. as a subset of the Cartesian product. The class-posterior probabilities given the relation are then reduced (under an iid assumption) to products of probabilistic quantities, estimated using a multilayer perceptron. Empirical evidence shows that, in spite of its simplicity, the technique compares favorably with established approaches on several tasks involving different graphical representations of the data. In particular, in the classification of molecules from the Mutagenesis dataset (friendly+unfriendly) the best result to date (93.91%) is obtained.
2009
Trentin, E., ERNESTO DI, I. (2009). Classification of graphical data made easy. NEUROCOMPUTING, 73(1-3), 204-212 [10.1016/j.neucom.2008.07.021].
File in questo prodotto:
File Dimensione Formato  
TrentinDiIorio_Neucom2009.pdf

non disponibili

Tipologia: Post-print
Licenza: PUBBLICO - Pubblico con Copyright
Dimensione 218.13 kB
Formato Adobe PDF
218.13 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11365/24247
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo