Robust combination of neural networks and hidden Markov models for speech recognition

Trentin, Edmondo; Gori, Marco

doi:10.1109/TNN.2003.820838

Acoustic modeling in state-of-the-art speech recognition systems usually relies on hidden Markov models (HMMs) with Gaussian emission densities. HMMs suffer from intrinsic limitations, mainly due to their arbitrary parametric assumption. Artificial neural networks (ANNs) appear to be a promising alternative in this respect, but they historically failed as a general solution to the acoustic modeling problem. This paper introduces algorithms based on a gradient-ascent technique for global training of a hybrid ANN/HMM system, in which the ANN is trained to estimate the emission probabilities of the states of the HMM. The approach is related to the major hybrid systems proposed by Bourlard and Morgan and by Bengio, with the aim of combining their benefits within a unified framework and overcoming their limitations. Several viable solutions are proposed to the "divergence problem" that may arise when training is accomplished over the maximum-likelihood (ML) criterion. Experimental results in speaker-independent, continuous speech recognition over Italian digit-strings validate the novel hybrid framework, allowing for improved recognition performance over HMMs with mixtures of Gaussian components, as well as over Bourlard and Morgan's paradigm. In particular, it is shown that the maximum a posteriori (MAP) version of the algorithm yields a 46.34% relative word error rate reduction with respect to standard HMMs.
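For readers unfamiliar with the metric, a relative word error rate (WER) reduction is usually computed as the drop in WER divided by the baseline WER. The sketch below assumes that conventional definition; the function name and the sample values are hypothetical and are not taken from the paper, which reports only the relative figure in this abstract.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values, chosen only for illustration (not from the paper):
# a baseline at 10.0% WER and a new system at 5.0% WER.
print(round(relative_wer_reduction(10.0, 5.0), 2))  # prints 50.0
```

Under this definition, a relative reduction of 46.34% means the new system's WER is a little over half of the baseline's, whatever the absolute values are.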

Trentin, E., Gori, M. (2003). Robust combination of neural networks and hidden Markov models for speech recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS, 14(6), 1519-1531 [10.1109/TNN.2003.820838].


2003-01-01


Year: 2003
Journal in which the work is published: IEEE TRANSACTIONS ON NEURAL NETWORKS
Citation: Trentin, E., Gori, M. (2003). Robust combination of neural networks and hidden Markov models for speech recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS, 14(6), 1519-1531 [10.1109/TNN.2003.820838].
Appears in types: 1.1 Journal article

Files in this record:

File	Size	Format
TrentinGori2003.pdf (not available; Type: Post-print; License: Public, with copyright)	649.28 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/24542

Warning: the data displayed have not been validated by the university.
