Exploiting Wikipedia Structure for Short Query Expansion in Cultural Heritage

Mohannad Almasri 1, Jean-Pierre Chevallet 1, Catherine Berrut 2
2 MRIM - Modélisation et Recherche d’Information Multimédia [Grenoble]
LIG - Laboratoire d'Informatique de Grenoble, Inria - Institut National de Recherche en Informatique et en Automatique
Abstract: This paper addresses the problem of short, precise queries, which do not carry enough information to be unambiguous. Pseudo-relevance feedback (PRF) is an effective technique for improving retrieval performance by expanding a user query. However, this collection-based expansion method does not work well for short queries. Therefore, instead of PRF, we present a semantic query expansion method that uses Wikipedia as external knowledge. We expand short queries with semantically related terms extracted from Wikipedia. We propose three variations of expansion term selection and study their effectiveness. We incorporate the expansion terms into the original query and adapt language models to evaluate the expanded queries. Experiments on CLEF cultural heritage corpora show a significant improvement in retrieval performance. We also show that the number of expansion terms has an important impact on the precision improvement.
Document type:
Conference paper
CORIA, 2014, Nancy, France. 2014

https://hal.inria.fr/hal-00953137
Contributor: Marie-Christine Fauvet
Submitted on: Friday, 28 February 2014 - 10:52:08
Last modified on: Thursday, 11 January 2018 - 06:22:06

Identifiers

  • HAL Id : hal-00953137, version 1

Citation

Mohannad Almasri, Jean-Pierre Chevallet, Catherine Berrut. Exploiting Wikipedia Structure for Short Query Expansion in Cultural Heritage. CORIA, 2014, Nancy, France. 2014. 〈hal-00953137〉
