
Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition

Imran Ahamad Sheikh 1, Dominique Fohr 1, Irina Illina 1, Georges Linares 2
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : The diachronic nature of broadcast news data leads to the problem of Out-Of-Vocabulary (OOV) words in Large Vocabulary Continuous Speech Recognition (LVCSR) systems. Analysis of OOV words reveals that a majority of them are Proper Names (PNs). However, PNs are important for automatic indexing of audio-video content and for obtaining reliable automatic transcriptions. In this paper, we focus on the problem of OOV PNs in diachronic audio documents. To enable recovery of the PNs missed by the LVCSR system, relevant OOV PNs are retrieved by exploiting the semantic context of the LVCSR transcriptions. For retrieval of OOV PNs, we explore topic and semantic context derived from Latent Dirichlet Allocation (LDA) topic models, continuous word vector representations, and the Neural Bag-of-Words (NBOW) model, which is capable of learning task-specific word and context representations. We propose a Neural Bag-of-Weighted-Words (NBOW2) model, which learns to assign higher weights to words that are important for retrieval of an OOV PN. With experiments on French broadcast news videos, we show that the NBOW and NBOW2 models outperform the methods based on raw embeddings from LDA and Skip-gram models. Combining the NBOW and NBOW2 models gives faster convergence during training. Second-pass speech recognition experiments, in which the LVCSR vocabulary and language model are updated with the retrieved OOV PNs, demonstrate the effectiveness of the proposed context models.
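The difference between the NBOW and NBOW2 context representations named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the weighting function, so the sigmoid of a dot product with a learned `anchor` vector is an assumption made here, as are all function names and the toy vectors. NBOW is shown as a plain average of the word vectors in a transcription, NBOW2 as an average of the same vectors after scaling each by its word weight, and candidate OOV PNs are scored with a softmax over dot products.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mean_vector(vectors, weights=None):
    """Average the word vectors; if weights are given, scale each vector first."""
    if weights is None:
        weights = [1.0] * len(vectors)  # plain NBOW: every word counts equally
    dim = len(vectors[0])
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / len(vectors)
        for i in range(dim)
    ]

def word_weights(vectors, anchor):
    """ASSUMED weighting: sigmoid of each word vector's dot product with an anchor."""
    return [1.0 / (1.0 + math.exp(-dot(v, anchor))) for v in vectors]

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score_oov_pns(context, pn_vectors):
    """Probability of each candidate OOV PN given a context vector."""
    return softmax([dot(context, p) for p in pn_vectors])

# Hypothetical toy values; in a trained model all of these would be learned.
doc_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # words of one transcription
anchor = [2.0, -2.0]                                # favours the first dimension
pn_vectors = [[1.0, 0.0], [0.0, 1.0]]               # two candidate OOV PNs

nbow_context = mean_vector(doc_vectors)
weights = word_weights(doc_vectors, anchor)
nbow2_context = mean_vector(doc_vectors, weights)

nbow_probs = score_oov_pns(nbow_context, pn_vectors)
nbow2_probs = score_oov_pns(nbow2_context, pn_vectors)
```

With these toy values the unweighted average cannot separate the two candidates, while the weighted average leans towards the candidate that matches the highly weighted word, which is the behaviour the abstract attributes to NBOW2.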
Document type :
Journal articles

Cited literature [76 references]
Contributor : Imran Sheikh
Submitted on : Wednesday, February 8, 2017 - 12:11:23 PM
Last modification on : Friday, July 8, 2022 - 10:06:39 AM
Long-term archiving on : Tuesday, May 9, 2017 - 1:11:03 PM


Files produced by the author(s)



Imran Ahamad Sheikh, Dominique Fohr, Irina Illina, Georges Linares. Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2017, 25 (3), pp. 598-610. ⟨10.1109/TASLP.2017.2651361⟩. ⟨hal-01461617⟩


