Aste, M., Boninsegna, M., Freno, A., Trentin, E. (2015). Techniques for dealing with incomplete data: a tutorial and survey. Pattern Analysis and Applications, 18(1), 1-29. doi:10.1007/s10044-014-0411-9.
Techniques for dealing with incomplete data: a tutorial and survey
Freno, Antonino; Trentin, Edmondo
2015-01-01
Abstract
Real-world applications of pattern recognition and machine learning algorithms often involve data that are partly missing, corrupted by noise, or otherwise incomplete. Nevertheless, developments in the machine learning community over the last decade have mostly focused on the mathematical analysis of learning machines, making it difficult for practitioners to obtain an overview of the major approaches to this issue. Paradoxically, as a consequence, even established methodologies rooted in statistics appear to have long been forgotten. Although the relevant literature is too vast for exhaustive coverage, the first goal of this paper is to provide the reader with a significant survey of the major, or particularly sound, techniques for pattern recognition, machine learning, and density estimation from incomplete data. Second, the paper aims to serve as a practical tutorial for the interested practitioner, allowing for a self-contained, step-by-step understanding of several approaches. The techniques are categorized as follows: (1) heuristic methods; (2) statistical approaches; (3) connectionist-oriented techniques; (4) other approaches (dynamical systems, adversarial deletion of features, etc.).
File: 17-AsteBoninsegnaFrenoTrentin.pdf (publisher's PDF, Adobe PDF, 527.28 kB). License: not public, private/restricted access.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/49321