Self-Organizing Maps capable of encoding structured information will be used to cluster XML documents. Documents formatted in XML are naturally represented as graph data structures. It will be shown that Self-Organizing Maps can be trained in an unsupervised fashion to group XML-structured data into clusters, and that this task scales in linear time with the size of the corpus. It will also be shown that some simple prior knowledge of the data structures helps group the XML documents efficiently.
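
As a rough illustration of what representing an XML document as a graph data structure can mean, the sketch below converts a document into a labeled tree in which each node carries only its tag name. This is a minimal sketch under stated assumptions, not the encoding used in the paper: the `Node` class, the helper functions, and the sample document are all hypothetical, and text content and attributes are deliberately ignored.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a tag label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def xml_to_tree(element: ET.Element) -> Node:
    # Keep only the structure: tag names and parent/child links.
    # Text and attributes are dropped (an assumption of this sketch).
    return Node(element.tag, [xml_to_tree(child) for child in element])


def size(node: Node) -> int:
    # Total number of nodes in the tree.
    return 1 + sum(size(child) for child in node.children)


def depth(node: Node) -> int:
    # Number of nodes on the longest root-to-leaf path.
    return 1 + max((depth(child) for child in node.children), default=0)


def max_outdegree(node: Node) -> int:
    # Largest number of children held by any single node.
    return max([len(node.children)] + [max_outdegree(c) for c in node.children])


# Hypothetical sample document, invented for illustration only.
doc = "<article><title>t</title><body><sec><p/><p/></sec></body></article>"
tree = xml_to_tree(ET.fromstring(doc))
print(tree.label, size(tree), depth(tree), max_outdegree(tree))
```

Simple structural statistics such as the node count, depth, and maximum out-degree are the kind of information one might inspect before choosing how to feed such trees to a structure-aware model.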

Hagenbuchner, M., Sperduti, A., Tsoi, A. C., Trentini, F., Scarselli, F., Gori, M. (2006). Clustering XML documents using self-organizing maps for structures. In Advances in XML Information Retrieval and Evaluation (pp. 481-496). Berlin: Springer-Verlag [10.1007/978-3-540-34963-1_37].

Clustering XML documents using self-organizing maps for structures

SCARSELLI, FRANCO; GORI, MARCO
2006-01-01

Year: 2006
ISBN: 9783540349624, 9783540349631
Use this identifier to cite or link to this document: https://hdl.handle.net/11365/5885