Comparison and Analysis of Several Phonetic Decoding Approaches

Luiza Orosanu 1 Denis Jouvet 1
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This article analyzes the phonetic decoding performance obtained with different choices of linguistic units. The long-term goal is to use such an approach to support communication with deaf people, running on an embedded decoder on a portable terminal, which introduces constraints on the model size. As a first step, this paper compares the performance of various approaches on the ESTER2 and ETAPE speech corpora. Two baseline systems are considered: one relying on a large-vocabulary speech recognizer, and another relying on a phonetic n-gram language model. A third model, which relies on a syllable-based lexicon and a trigram language model, provides a good tradeoff between model size and phonetic decoding performance: its phone error rate is only 4% worse (absolute) than that of the large-vocabulary recognizer, and much better than that of the phone n-gram language model. Phone error rates are then analyzed with respect to signal-to-noise ratio (SNR) and speaking rate.
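The abstract compares systems by phone error rate. As a minimal sketch, assuming the conventional definition (substitutions, deletions and insertions from a Levenshtein alignment, divided by the number of reference phones), the metric could be computed as below. The function names and the example phone symbols are illustrative assumptions, not taken from the paper, whose exact scoring procedure is not described in this record.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (unit costs)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # aligning ref[:i] with an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference phone
                            curr[j - 1] + 1,      # insertion of a hypothesis phone
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """Phone error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference phone sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution (o~ -> o) and one deletion (R)
# against a 5-phone reference.
reference = "b o~ Z u R".split()
hypothesis = "b o Z u".split()
print(phone_error_rate(reference, hypothesis))  # 40.0
```

Under this definition, an "absolute" difference between two systems is the plain difference of their percentages (in percentage points), as opposed to a relative change.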
Document type:
Conference paper
Ivan Habernal and Václav Matoušek. TSD - 16th International Conference on Text, Speech and Dialogue - 2013, Sep 2013, Pilsen, Czech Republic. Springer Verlag, 8082, pp.161-168, 2013, Lecture Notes in Artificial Intelligence. 〈http://link.springer.com/chapter/10.1007%2F978-3-642-40585-3_21〉

https://hal.inria.fr/hal-00834313
Contributor: Denis Jouvet
Submitted on: Friday, March 25, 2016 - 17:35:58
Last modified on: Thursday, January 11, 2018 - 06:25:24
Document(s) archived on: Sunday, June 26, 2016 - 15:22:40

File

articleTSD2013_#98-Luiza-Orosa...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00834313, version 1

Citation

Luiza Orosanu, Denis Jouvet. Comparison and Analysis of Several Phonetic Decoding Approaches. Ivan Habernal and Václav Matoušek. TSD - 16th International Conference on Text, Speech and Dialogue - 2013, Sep 2013, Pilsen, Czech Republic. Springer Verlag, 8082, pp.161-168, 2013, Lecture Notes in Artificial Intelligence. 〈http://link.springer.com/chapter/10.1007%2F978-3-642-40585-3_21〉. 〈hal-00834313〉
