This paper addresses the problem of enhancing voice quality for people suffering from dysphonia, which is mainly due to irregular vibration of the vocal folds. A generalised subspace approach based on the Generalised Singular Value Decomposition (GSVD) is proposed for enhancing speech corrupted by additive noise, whether or not the noise is white. The clean signal is estimated by nulling the components of the noisy signal that lie in the noise subspace and retaining those that lie in the signal subspace. Two approaches are compared, which differ in the choice of the noise component. An optimised adaptive comb filter is applied first, to reduce noise between harmonics. Perceptual and objective voice quality measures show improvements when the method is tested on isolated words uttered by dysphonic subjects. The proposed method seems promising as a first step towards denoising fluent speech for people affected by hoarseness. The aim is to provide users (disabled people as well as clinicians) with a device that allows intelligible and effortless speech and gives useful information about possible functional recovery. This could help people in social situations where they interact with unfamiliar communication partners, such as at work and in everyday life.
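The subspace step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a separate noise-only reference segment is available, replaces a true GSVD routine with an explicit prewhitening step (QR factorisation of the noise matrix followed by an ordinary SVD, a commonly used equivalent formulation), and omits the adaptive comb filter entirely. The function names, the window length `cols` and the retained `rank` are hypothetical choices made for illustration only.

```python
import numpy as np

def hankel_matrix(x, cols):
    """Stack overlapping length-`cols` windows of x as rows, so H[i, j] = x[i + j]."""
    rows = len(x) - cols + 1
    return np.array([x[i:i + cols] for i in range(rows)])

def average_antidiagonals(H):
    """Map a matrix back to a 1-D signal by averaging each anti-diagonal."""
    rows, cols = H.shape
    out = np.zeros(rows + cols - 1)
    counts = np.zeros(rows + cols - 1)
    for i in range(rows):
        out[i:i + cols] += H[i]
        counts[i:i + cols] += 1
    return out / counts

def subspace_denoise(noisy, noise_ref, cols=20, rank=2):
    """Prewhitened subspace denoising (illustrative stand-in for a GSVD-based method).

    noisy     : 1-D array, signal plus additive (possibly coloured) noise
    noise_ref : 1-D array, noise-only reference (assumed available)
    cols      : window length of the Hankel matrices (assumed value)
    rank      : assumed dimension of the signal subspace
    """
    Y = hankel_matrix(np.asarray(noisy, dtype=float), cols)
    N = hankel_matrix(np.asarray(noise_ref, dtype=float), cols)
    # Triangular factor of the noise matrix: N^T N = R^T R.
    R = np.linalg.qr(N, mode="r")
    # Prewhiten the noisy data: Z = Y R^{-1}, computed without forming the inverse.
    Z = np.linalg.solve(R.T, Y.T).T
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Null the components assigned to the noise subspace, keep the signal subspace.
    s[rank:] = 0.0
    Z_hat = (U * s) @ Vt
    # Undo the prewhitening and restore a 1-D signal.
    Y_hat = Z_hat @ R
    return average_antidiagonals(Y_hat)
```

In this sketch a single sinusoid occupies a rank-2 signal subspace, which is why `rank=2` is a natural default for a toy example; for real voiced speech the rank would have to be chosen from the data, and the paper's own criteria for that choice are not reproduced here.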
Manfredi, C., Dori, F., Iadanza, E. (2006). Optimised generalised singular value decomposition for dysphonic voice quality enhancement. Acta Acustica united with Acustica, 92, 700-711.
Optimised generalised singular value decomposition for dysphonic voice quality enhancement
E. Iadanza
2006-01-01
https://hdl.handle.net/11365/1201115