Mixtures of Deep Neural Experts for Automated Speech Scoring

The paper addresses the automatic assessment of second-language proficiency from language learners' spoken responses to test prompts, a task of significant relevance to computer-assisted language learning. The approach relies on two separate modules: (1) an automatic speech recognition system that yields text transcripts of the spoken interactions, and (2) a multiple classifier system based on deep learners that classifies the transcripts into proficiency classes. Different deep neural network architectures (both feed-forward and recurrent) are specialized on diverse representations of the texts: a reference grammar, the outputs of probabilistic language models, several word embeddings, and two bag-of-words models. The individual classifiers are combined either via a probabilistic pseudo-joint model or via a neural mixture of experts. On the data of the third Spoken CALL Shared Task challenge, the approach obtained the highest values reported to date on three popular evaluation metrics.
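As a rough illustration of the "neural mixture of experts" idea mentioned in the abstract, the sketch below combines the class posteriors of several classifiers using softmax gating weights. It is a generic textbook formulation, not the authors' implementation: the function names, the gating scores, and the numeric values are all hypothetical, and in a trained system the gating scores would come from a gating network rather than being fixed by hand.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of real-valued scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(expert_posteriors: Sequence[Sequence[float]],
                       gate_scores: Sequence[float]) -> List[float]:
    """Combine per-expert class posteriors with softmax gating weights.

    expert_posteriors[k][c] is expert k's probability for class c.
    gate_scores[k] is a hypothetical raw gating score for expert k; a real
    gating network would compute it from the input.
    """
    if len(expert_posteriors) != len(gate_scores):
        raise ValueError("one gate score per expert is required")
    weights = softmax(gate_scores)
    n_classes = len(expert_posteriors[0])
    # Weighted sum of the experts' posteriors, class by class.
    return [
        sum(w * p[c] for w, p in zip(weights, expert_posteriors))
        for c in range(n_classes)
    ]


# Illustrative values only (not taken from the paper): three experts and
# two hypothetical proficiency classes.
posteriors = [[0.8, 0.2], [0.4, 0.6], [0.3, 0.7]]
gates = [0.0, 0.0, 0.0]  # equal scores give uniform gating weights
combined = mixture_of_experts(posteriors, gates)
```

Because the gating weights sum to one and each expert outputs a probability distribution, the combined vector is itself a valid distribution over the classes; raising one expert's gating score shifts the result toward that expert's posterior.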

Papi, S., Trentin, E., Gretter, R., Matassoni, M., Falavigna, D. (2020). Mixtures of Deep Neural Experts for Automated Speech Scoring. In Proc. of INTERSPEECH 2020 (pp. 3845-3849). Rotterdam: International Speech Communication Association [10.21437/Interspeech.2020-1055].

Papi S.; Trentin E.; Gretter R.; Matassoni M.; Falavigna D.

2020-01-01

Year: 2020
Appears in types: 4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
Papi-et-al.pdf (open access; type: publisher's PDF; license: Public, with copyright)	212.48 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1128575