Acoustic models based on hidden Markov models (HMMs) are highly sensitive to noise: recognition performance drops whenever acoustic conditions differ significantly between the training and test environments. Developing acoustic models that are intrinsically robust is therefore important. Robustness to noise is related to the generalization capabilities of the model. Artificial neural networks (ANNs) appear to be a promising alternative, but they have historically failed as a general paradigm for speech recognition. This paper addresses the problem by (i) investigating the recognition performance of the ANN/HMM hybrid proposed by the authors on tasks with noisy signals, and (ii) proposing an explicit "soft" weight grouping technique capable of improving its robustness. Experiments on noisy, speaker-independent connected-digit strings are presented. In particular, results on the VODIS II/SpeechDatCar database, collected in a real car environment, show a dramatic gain over the standard HMM, as well as over Bourlard and Morgan's hybrid.
Trentin, E., Gori, M. (2001). Toward noise-tolerant acoustic models. In Proceedings of Eurospeech 2001 (pp. 889-892). International Speech Communication Association.
Toward noise-tolerant acoustic models
TRENTIN E.;GORI M.
2001-01-01
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/22996