Exploiting Wikipedia Structure for Short Query Expansion in Cultural Heritage

Mohannad Almasri 1 Jean-Pierre Chevallet 1 Catherine Berrut 2
2 MRIM - Modélisation et Recherche d’Information Multimédia [Grenoble]
LIG - Laboratoire d'Informatique de Grenoble, Inria - Institut National de Recherche en Informatique et en Automatique
Abstract: This paper addresses the problem of short, precise queries, which do not carry enough information to be unambiguous. Pseudo-relevance feedback (PRF) is an effective technique for improving retrieval performance by expanding a user query. However, this collection-based expansion method does not work well for short queries. Instead of PRF, we therefore present a semantic query expansion method that uses Wikipedia as an external knowledge source. We expand short queries with semantically related terms extracted from Wikipedia, and we propose and study the effectiveness of three variations of expansion term selection. We incorporate the expansion terms into the original query and adapt language models to evaluate the expanded queries. Experiments on CLEF cultural heritage corpora show a significant improvement in retrieval performance. We also show that the number of expansion terms has an important impact on the improvement in precision.
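The abstract does not give the scoring formula or the term selection rules, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a standard query-likelihood language model with Dirichlet smoothing, and it assumes expansion terms are added to the query with a reduced weight. The names `expand` and `score`, the parameters `k`, `alpha` and `mu`, the toy documents, and the hand-written `related` list (standing in for terms that would come from Wikipedia) are all hypothetical.

```python
import math
from collections import Counter


def expand(query_terms, related_terms, k=2, alpha=0.5):
    """Build a weighted query: original terms get weight 1.0,
    the first k related terms get the (assumed) lower weight alpha."""
    weights = {term: 1.0 for term in query_terms}
    for term in related_terms[:k]:
        weights.setdefault(term, alpha)  # never downweight an original term
    return weights


def score(query_weights, doc_tokens, collection_tf, collection_len, mu=10.0):
    """Weighted query log-likelihood of a document, Dirichlet-smoothed."""
    doc_tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    total = 0.0
    for term, weight in query_weights.items():
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term absent from the whole collection: ignore it
        p_doc = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        total += weight * math.log(p_doc)
    return total


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "the mona lisa is a portrait painted by leonardo da vinci".split(),
    "d2": "leonardo da vinci painted la gioconda in florence".split(),
    "d3": "the eiffel tower is a landmark in paris".split(),
}
collection_tf = Counter(tok for tokens in docs.values() for tok in tokens)
collection_len = sum(collection_tf.values())

query = ["mona", "lisa"]
related = ["gioconda", "leonardo", "louvre"]  # stand-in for Wikipedia terms

plain = {term: 1.0 for term in query}
expanded = expand(query, related, k=2)


def rank(weights):
    """Return document ids sorted by decreasing score."""
    return sorted(
        docs,
        key=lambda d: score(weights, docs[d], collection_tf, collection_len),
        reverse=True,
    )


print("expanded query:", expanded)
print("ranking with expansion:", rank(expanded))
```

In this toy setting the unexpanded query cannot separate `d2` from `d3`, since neither contains a query term and both have the same length. The expanded query ranks `d2` above `d3` because `d2` matches the added terms. Changing `k` changes how many related terms enter the query, which is the kind of parameter the abstract reports as affecting precision.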
Document type:
Conference paper
CORIA, 2014, Nancy, France. 2014

Contributor: Marie-Christine Fauvet
Submitted on: Friday, February 28, 2014 - 10:52:08
Last modified on: Wednesday, November 7, 2018 - 13:42:02


  • HAL Id: hal-00953137, version 1



Mohannad Almasri, Jean-Pierre Chevallet, Catherine Berrut. Exploiting Wikipedia Structure for Short Query Expansion in Cultural Heritage. CORIA, 2014, Nancy, France. 2014. 〈hal-00953137〉


