Retrieving phrases by selecting the history: application to Automatic Speech Recognition

David Langlois 1 Kamel Smaïli 1 Jean-Paul Haton 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This paper addresses statistical language modeling for automatic speech recognition. We present a method for finding linguistic units in a corpus. The method, called the Selected History Principle, finds strong distant relationships between words. The new units are phrases made up of basic units of our vocabulary linked by these distant relationships. We adapt the multigram principle to large vocabularies in order to introduce an optimal subset of these sequences into a bigram model. The bigram model using these sequences outperforms the baseline bigram model by 21% in terms of perplexity and increases the recognition rate of the large-vocabulary system Sirocco by 8.7%; the word error rate decreases by 12.7%.
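
As an illustration of the kind of comparison reported in the abstract, the following minimal sketch contrasts the perplexity of a word bigram model with that of a bigram model whose vocabulary also contains phrase units. It is not the authors' Selected History Principle or multigram selection: the phrase list is hand-picked, add-one smoothing is an assumption, and the toy corpus and all function names (merge_phrases, train_bigram, perplexity) are hypothetical.

```python
import math
from collections import Counter

def merge_phrases(words, phrases):
    """Replace each listed word pair by one unit, e.g. ("new", "york") -> "new_york"."""
    out, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
            out.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

def train_bigram(tokens):
    """Return an add-one smoothed estimate of P(cur | prev) (smoothing is an assumption)."""
    histories = Counter(tokens[:-1])            # how often each unit occurs as a history
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab_size = len(set(tokens)) + 1           # one extra slot for unseen units

    def prob(prev, cur):
        return (bigrams[(prev, cur)] + 1) / (histories[prev] + vocab_size)

    return prob

def perplexity(prob, tokens, n_words):
    """Perplexity normalised by a word count, so models with different units stay comparable."""
    log_sum = sum(math.log(prob(p, c)) for p, c in zip(tokens, tokens[1:]))
    return math.exp(-log_sum / n_words)

# Toy data, purely illustrative (not taken from the paper).
train = "i live in new york and work in new york".split()
test = "i work in new york".split()
phrases = {("new", "york")}                     # hand-picked, not learned

word_model = train_bigram(train)
phrase_model = train_bigram(merge_phrases(train, phrases))

n_words = len(test) - 1                         # same normaliser for both models
pp_word = perplexity(word_model, test, n_words)
pp_phrase = perplexity(phrase_model, merge_phrases(test, phrases), n_words)

# Relative perplexity reduction of the phrase model over the word baseline.
relative_reduction = (pp_word - pp_phrase) / pp_word
print(f"word: {pp_word:.2f}  phrase: {pp_phrase:.2f}  reduction: {relative_reduction:.1%}")
```

Both perplexities are normalised by the same word count, because the phrase model predicts fewer units over the same text and a per-unit normalisation would not be comparable.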
Document type : Conference paper

https://hal.inria.fr/inria-00100805
Contributor : Publications Loria
Submitted on : Tuesday, September 26, 2006 - 2:51:09 PM
Last modification on : Thursday, January 11, 2018 - 6:19:55 AM

Identifiers

  • HAL Id : inria-00100805, version 1

Citation

David Langlois, Kamel Smaïli, Jean-Paul Haton. Retrieving phrases by selecting the history: application to Automatic Speech Recognition. 7th International Conference on Spoken Language Processing - ICSLP'2002, Sep 2002, Denver, USA, pp.721. ⟨inria-00100805⟩
