Comparison and Analysis of Several Phonetic Decoding Approaches

Luiza Orosanu 1, Denis Jouvet 1
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This article analyzes the phonetic decoding performance obtained with different choices of linguistic units. The longer-term goal is to use such an approach to support communication with deaf people, running it on an embedded decoder on a portable terminal, which introduces constraints on the model size. As a first step, this paper compares the performance of various approaches on the ESTER2 and ETAPE speech corpora. Two baseline systems are considered: one relying on a large vocabulary speech recognizer, and another relying on a phonetic n-gram language model. A third system, which relies on a syllable-based lexicon and a trigram language model, provides a good tradeoff between model size and phonetic decoding performance. Its phone error rate is only 4% (absolute) worse than that of the large vocabulary recognizer, and much better than that of the phone n-gram language model. Phone error rates are then analyzed with respect to SNR and speaking rate.
https://hal.inria.fr/hal-00834313
Contributor : Denis Jouvet
Submitted on : Friday, March 25, 2016 - 5:35:58 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Long-term archiving on : Sunday, June 26, 2016 - 3:22:40 PM

File

articleTSD2013_#98-Luiza-Orosa...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00834313, version 1

Citation

Luiza Orosanu, Denis Jouvet. Comparison and Analysis of Several Phonetic Decoding Approaches. TSD - 16th International Conference on Text, Speech and Dialogue - 2013, Sep 2013, Pilsen, Czech Republic. pp.161-168. ⟨hal-00834313⟩
