Semantic labelling of data using the web

Rigutini, Leonardo; Di Iorio, Ernesto; Ernandes, Marco; Maggini, Marco
2006-01-01

Abstract

This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available on the Web to create a profile of each term or entity (entity context lexicon, ECL). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization), composed primarily of the words that often appear in the same contexts as the entity. These profiles model the contexts in which the entity usually appears, and they can subsequently be processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task, which uses the ECLs of the training entities to build a profile for each class (class context lexicon, CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains).
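
The ECL/CCL idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tiny stopword list, and the toy snippets (standing in for contexts retrieved from the Web) are all hypothetical. Raw term counts, cosine similarity, and a fixed threshold are assumptions swapped in for the weighting scheme, the profile-based scoring, and the neural-network decision module, none of which the abstract specifies.

```python
from collections import Counter
import math

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "and", "in", "at", "of"}

def build_ecl(entity, snippets, top_k=50):
    """Entity context lexicon: counts of terms co-occurring with the entity."""
    counts = Counter()
    for text in snippets:
        tokens = text.lower().split()
        if entity in tokens:
            counts.update(t for t in tokens
                          if t != entity and t not in STOPWORDS)
    return dict(counts.most_common(top_k))

def build_ccl(ecls):
    """Class context lexicon: here, the sum of the ECLs of a class's training entities."""
    total = Counter()
    for ecl in ecls:
        total.update(ecl)  # Counter.update adds the counts of a mapping
    return dict(total)

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(ecl, ccls, threshold=0.3):
    """Multi-label decision: keep every class scoring above a fixed threshold
    (an assumed stand-in for the neural decision module)."""
    scores = {label: cosine(ecl, ccl) for label, ccl in ccls.items()}
    labels = [label for label, s in scores.items() if s >= threshold]
    return labels, scores

# Toy training data: entity -> snippets, grouped by semantic domain.
training = {
    "music": {
        "piano": ["the piano played a soft melody",
                  "a concert piano and orchestra"],
        "violin": ["the violin played a melody at the concert"],
    },
    "sport": {
        "goal": ["the striker scored a goal in the match"],
        "referee": ["the referee stopped the match"],
    },
}

ccls = {
    domain: build_ccl(build_ecl(e, s) for e, s in entities.items())
    for domain, entities in training.items()
}

guitar_ecl = build_ecl("guitar", ["the guitar played a melody at the concert"])
labels, scores = classify(guitar_ecl, ccls)
print(labels, scores)
```

Because every class is scored independently against the threshold, an entity whose contexts overlap several domains can receive more than one label, which is the multi-label situation the decision module is meant to handle.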
Year: 2006
ISBN: 0769527493
Rigutini, L., Di Iorio, E., Ernandes, M., Maggini, M. (2006). Semantic labelling of data using the web. In Proceedings of the International Workshop on Technologies and Applications on Knowledge Computing on the Web at the 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006) (pp. 638-641) [10.1109/WI-IATW.2006.118].
Files in this record:
File: WI06.pdf (not available)
Type: Post-print
License: Not public - private/restricted access
Size: 534.88 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/37015