XML is becoming increasingly popular as a language for representing many types of electronic documents. The consequence of the strict structural document description via XML is that a relatively new task in mining documents based on structural and/or content information has emerged. In this paper we investigate (1) the suitability of new unsupervised machine learning methods for the clustering task of XML documents, and (2) the importance of contextual information for the same task. These tasks are part of an international competition on XML clustering and categorization (INEX 2006). It will be shown that the proposed approaches provide a suitable tool for the clustering of structured data as they yield the best results in the international INEX 2006 competition on clustering of XML data.

Kc, M., Hagenbuchner, M., Tsoi, A.C., Scarselli, F., Sperduti, A., Gori, M. (2006). XML document mining using contextual self-organizing maps for structures. In Comparative Evaluation of XML Information Retrieval Systems, (pp.510-524). Berlin Heidelberg : Springer [10.1007/978-3-540-73888-6_47].

XML document mining using contextual self-organizing maps for structures

SCARSELLI, F.;GORI, M.
2006-01-01

Abstract

XML is becoming increasingly popular as a language for representing many types of electronic documents. The consequence of the strict structural document description via XML is that a relatively new task in mining documents based on structural and/or content information has emerged. In this paper we investigate (1) the suitability of new unsupervised machine learning methods for the clustering task of XML documents, and (2) the importance of contextual information for the same task. These tasks are part of an international competition on XML clustering and categorization (INEX 2006). It will be shown that the proposed approaches provide a suitable tool for the clustering of structured data as they yield the best results in the international INEX 2006 competition on clustering of XML data.
2006
9783540738879
9783540738886
Kc, M., Hagenbuchner, M., Tsoi, A.C., Scarselli, F., Sperduti, A., Gori, M. (2006). XML document mining using contextual self-organizing maps for structures. In Comparative Evaluation of XML Information Retrieval Systems, (pp.510-524). Berlin Heidelberg : Springer [10.1007/978-3-540-73888-6_47].
File in questo prodotto:
File Dimensione Formato  
INEX2006 SOM.pdf

non disponibili

Tipologia: PDF editoriale
Licenza: NON PUBBLICO - Accesso privato/ristretto
Dimensione 9.53 MB
Formato Adobe PDF
9.53 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11365/42976
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo