Vectorisation des processus d'appariement document-requête [Vectorization of document-query matching processes]

Vincent Claveau 1,*, Romain Tavenard 1, Laurent Amsaleg 1
* Corresponding author
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : In most IR systems, computing the proximity between a query and a document quickly is a central concern. In the Vector Space Model, this computation is generally very efficient. With very long queries or with other IR models, however, its cost can be quite high. In this paper, we propose a simple approach that transforms any document-query pairing technique into a vectorial representation. It then becomes possible to use existing approximate indexing techniques that allow the fast computation of distances between high-dimensional vectors. We show experimentally that our approach does not degrade the results and can even yield better recall rates at high document cut-off values.
Document type :
Conference papers

https://hal.inria.fr/inria-00561797
Contributor : Patrick Gros
Submitted on : Tuesday, February 1, 2011 - 7:40:56 PM
Last modification on : Friday, November 16, 2018 - 1:22:23 AM

Identifiers

  • HAL Id : inria-00561797, version 1

Citation

Vincent Claveau, Romain Tavenard, Laurent Amsaleg. Vectorisation des processus d'appariement document-requête. 7e conférence en recherche d'informations et applications, CORIA'10, Mar 2010, Sousse, Tunisie. ⟨inria-00561797⟩
