
A language model which integrates uncertainty

Caroline Tambellini 1
1 ADELE [1981-2015] - Environnements et outils pour le Génie Logiciel Industriel [1981-2015]
LIG [2007-2015] - Laboratoire d'Informatique de Grenoble [2007-2015]
Abstract : All information retrieval models rely on exact matching between query terms and document terms. This principle assumes that terms are correctly extracted from the documents and queries. In some contexts, however (such as documents generated by automatic speech recognition or optical character recognition systems), this assumption does not hold. We propose a generalisation of the language-model-based information retrieval model that integrates this uncertainty. To do so, we introduce two notions: the term certainty value (related to the extraction process) and the pairing between two terms. The pairing between two terms is defined by their relative position (called the concordance) and the area they have in common (called the intersection).
Document type :
Conference papers

https://hal.inria.fr/hal-00953890
Contributor : Marie-Christine Fauvet
Submitted on : Friday, February 28, 2014 - 4:03:18 PM
Last modification on : Friday, July 17, 2020 - 11:10:25 AM

Identifiers

  • HAL Id : hal-00953890, version 1

Collections

CNRS | UGA | LIG

Citation

Caroline Tambellini. A language model which integrates uncertainty. FDIA International Conference, 2007, Glasgow, UK. ⟨hal-00953890⟩
