The Web spam detection problem has received growing interest in recent years, since it strongly affects the reputation of search engines and can determine whether the quality of their results improves or deteriorates. The World Wide Web is naturally represented as a graph, where nodes correspond to Web pages and edges to hyperlinks. In this paper, we address the Web spam detection problem using the Graph Neural Network (GNN) architecture, a supervised neural network model capable of solving classification and regression problems on graph-structured domains. Interestingly, a GNN can act as a mixed transductive-inductive model that, during the test phase, classifies pages by using both the explicit memory of the classes assigned to the training examples and the information stored in the network parameters. This property of GNNs is evaluated on a well-known benchmark for Web spam detection, the WEBSPAM-UK2006 dataset. The results obtained are comparable to the state of the art on this dataset. Moreover, the experiments show that the standard and the transductive-inductive GNNs perform very similarly, whereas the computation time required by the latter is significantly shorter.
Belahcen, A., Bianchini, M., Scarselli, F. (2015). Web spam detection using transductive-inductive Graph Neural Networks. In Advances in Neural Networks: Computational and Theoretical Issues (pp. 83-91). Cham (ZG): Springer International Publishing Switzerland [10.1007/978-3-319-18164-6_9].
Web spam detection using transductive-inductive Graph Neural Networks
Bianchini, M.; Scarselli, F.
2015-01-01
File | Description | Type | License | Size | Format |
---|---|---|---|---|---|
WIRN14.pdf (not available) | WIRN 2014 article | Post-print | Not public (private/restricted access) | 718.17 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/978754