Mixtures of Deep Neural Experts for Automated Speech Scoring

Edmondo Trentin
2020-01-01

Abstract

This paper addresses the task of automatically assessing second-language proficiency from language learners' spoken responses to test prompts. The task is highly relevant to the field of computer-assisted language learning. The proposed approach relies on two separate modules: (1) an automatic speech recognition system that yields text transcripts of the spoken interactions, and (2) a multiple classifier system based on deep learners that assigns the transcripts to proficiency classes. Different deep neural network architectures (both feed-forward and recurrent) are specialized over diverse representations of the texts, namely: a reference grammar, the outcome of probabilistic language models, several word embeddings, and two bag-of-words models. The individual classifiers are combined either via a probabilistic pseudo-joint model or via a neural mixture of experts. On the data of the third Spoken CALL Shared Task challenge, the approach obtained the highest values to date in terms of three popular evaluation metrics.
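As an illustration of the mixture-of-experts combination mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each expert already outputs a posterior distribution over the proficiency classes and that a gating component supplies one raw score per expert. In the paper the gating is neural; here the gate scores are simply passed in as numbers. The names `softmax`, `mixture_of_experts`, `expert_posteriors`, and `gate_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw gate scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(expert_posteriors, gate_scores):
    """Combine per-expert class posteriors using softmax gating weights.

    expert_posteriors: one list of class probabilities per expert (assumed input).
    gate_scores: one raw score per expert (assumed to come from a gating model).
    Returns the weighted average of the expert posteriors.
    """
    if len(expert_posteriors) != len(gate_scores):
        raise ValueError("need exactly one gate score per expert")
    weights = softmax(gate_scores)
    n_classes = len(expert_posteriors[0])
    combined = [0.0] * n_classes
    for weight, posterior in zip(weights, expert_posteriors):
        if len(posterior) != n_classes:
            raise ValueError("all experts must score the same classes")
        for c, p in enumerate(posterior):
            combined[c] += weight * p
    return combined


# Two hypothetical experts scoring two hypothetical classes.
posteriors = [[0.9, 0.1], [0.5, 0.5]]
combined = mixture_of_experts(posteriors, gate_scores=[0.0, 0.0])
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

With equal gate scores the combination reduces to a plain average of the expert posteriors; unequal scores shift the decision toward the experts the gate trusts more for the given input.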
2020
Papi, S., Trentin, E., Gretter, R., Matassoni, M., Falavigna, D. (2020). Mixtures of Deep Neural Experts for Automated Speech Scoring. In Proc. of INTERSPEECH 2020 (pp. 3845-3849). International Speech Communication Association [10.21437/Interspeech.2020-1055].
Files in this record:
File: Papi-et-al.pdf
Access: open access
Type: Publisher's PDF
License: PUBLIC - Public with Copyright
Size: 212.48 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/1128575