A survey of hybrid ANN/HMM models for automatic speech recognition

Trentin, E.; Gori, M.

doi:10.1016/S0925-2312(00)00308-8

Despite the advances of recent decades, automatic speech recognition (ASR) remains a challenging task. In particular, recognition systems based on hidden Markov models (HMMs) are effective under many circumstances, but suffer from some major limitations that restrict the applicability of ASR technology in real-world environments. Attempts were made to overcome these limitations by adopting artificial neural networks (ANNs) as an alternative paradigm for ASR, but ANNs were unsuccessful in dealing with long time-sequences of speech signals. Between the end of the 1980s and the beginning of the 1990s, some researchers began exploring a new research area, combining HMMs and ANNs within a single, hybrid architecture. The goal of hybrid systems for ASR is to take advantage of the properties of both HMMs and ANNs, improving flexibility and recognition performance. A variety of architectures and novel training algorithms have been proposed in the literature. This paper reviews a number of significant hybrid models for ASR, bringing together approaches and techniques from a highly specialized and heterogeneous literature. Efforts concentrate on describing and referencing architectures and algorithms, their advantages and limitations, as well as on categorizing them into broad classes. Early attempts to emulate HMMs with ANNs are described first. We then focus on ANNs that estimate posterior probabilities of the states of an HMM, and on “global” optimization, where a single, overall training criterion is defined over the HMM and the ANNs. Connectionist vector quantization for discrete HMMs and other, more recent approaches are also reviewed. It is pointed out that, in addition to their theoretical interest, hybrid systems have allowed for tangible improvements in recognition performance over standard HMMs in difficult and significant benchmark tasks.

Trentin, E., Gori, M. (2001). A survey of hybrid ANN/HMM models for automatic speech recognition. NEUROCOMPUTING, 37(1-4), 91-126 [10.1016/S0925-2312(00)00308-8].

2001-01-01

Year: 2001
Journal in which the work is published: NEUROCOMPUTING
Appears in types: 1.1 Journal article

Files in this item:

File	Size	Format
Survey2001.pdf (not available; Type: Post-print; License: PUBLIC - Public with Copyright)	355.06 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/23987
