In a traditional machine learning task, the goal is to train a classifier using only labeled data (feature/label pairs) so that it generalizes to completely new data. Unfortunately, in many cases it is difficult, expensive, or time-consuming to obtain the labeled instances needed for training, since a human supervisor is usually required to annotate large amounts of data to collect a significant training set. Moreover, in many cases we are not interested in generalizing to arbitrary unseen examples; we only need to discover labels for a large quantity of unlabeled but already available data by using a small subset of labeled data. When both conditions hold, a semi-supervised learning algorithm can be used to solve the classification problem. Semi-supervised learning algorithms combine a large amount of unlabeled data with a small available set of labeled data to build a reliable classifier. It is particularly interesting to focus on one subclass of semi-supervised learning algorithms, namely graph-based semi-supervised learning. In this framework, data are represented as a graph whose nodes are the labeled and unlabeled examples in the dataset, and whose edges are added according to a given similarity relationship between pairs of examples.
Pucci, A., Gori, M., Maggini, M. (2007). Semi-supervised active learning in graphical domains. In Proceedings of the 5th International Workshop on Mining and Learning with Graphs (MLG07) (pp. 187-190).
Semi-supervised active learning in graphical domains
Pucci, Augusto; Gori, Marco; Maggini, Marco
2007-01-01
File: MLG07.pdf (post-print, Adobe PDF, 114.42 kB). License: not public, private/restricted access.
https://hdl.handle.net/11365/36393