Integrating Stress Information in Large Vocabulary Continuous Speech Recognition - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2012

Integrating Stress Information in Large Vocabulary Continuous Speech Recognition

Abstract

In this paper we propose a novel method for integrating stress information into the decoding step of a speech recognizer. A multiscale rhythm model is used to determine a stress score for each syllable, and these scores are then used to reinforce paths during search. Two strategies for integrating the stress information were employed: the first reinforces paths through all syllables with a value proportional to their stress score, while the second enhances only paths passing through stressed syllables, but with a constant value. The former strategy slightly outperforms the latter, bringing a relative improvement of more than 2% over the baseline. Furthermore, the stress information proved to be a robust feature, performing well even on foreign-accented speech.
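
The two integration strategies described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `weight`, `threshold`, and `bonus` parameters, and the choice of adding the reinforcement to a path log-score are all hypothetical, and the abstract does not say how the scores are scaled or how a syllable is classified as stressed.

```python
from typing import Sequence


def proportional_bonus(stress_scores: Sequence[float], weight: float = 0.1) -> float:
    """Strategy 1 (sketch): every syllable on the path contributes a
    reinforcement proportional to its stress score.
    `weight` is a hypothetical scaling factor."""
    return sum(weight * score for score in stress_scores)


def constant_bonus(
    stress_scores: Sequence[float], threshold: float = 0.5, bonus: float = 0.1
) -> float:
    """Strategy 2 (sketch): only syllables considered stressed contribute,
    and each contributes the same constant value.
    `threshold` and `bonus` are hypothetical; the source does not state
    how stressed syllables are selected."""
    return sum(bonus for score in stress_scores if score > threshold)


def reinforce_path(
    path_log_score: float,
    stress_scores: Sequence[float],
    strategy: str = "proportional",
) -> float:
    """Add the stress reinforcement to a path's log-score (assumed additive)."""
    if strategy == "proportional":
        return path_log_score + proportional_bonus(stress_scores)
    if strategy == "constant":
        return path_log_score + constant_bonus(stress_scores)
    raise ValueError(f"unknown strategy: {strategy}")
```

In this sketch a path whose syllables have higher stress scores receives a larger reinforcement under the first strategy, whereas under the second strategy only the number of stressed syllables on the path matters.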
Main file
IS2012_stress.pdf (250.52 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00758622, version 1 (29-11-2012)

Identifiers

  • HAL Id: hal-00758622, version 1

Cite

Bogdan Ludusan, Stefan Ziegler, Guillaume Gravier. Integrating Stress Information in Large Vocabulary Continuous Speech Recognition. INTERSPEECH - Annual Conference of the International Speech Communication Association, 2012, United States. ⟨hal-00758622⟩