Retrieving Meaningful Relaxed Tightest Fragments for XML Keyword Search

Abstract: Adapting keyword search to XML data, known as XML keyword search (XKS), has attracted attention recently. One of its key tasks is to return meaningful fragments as the result. [1] is the latest work following this trend, and it focuses on returning the fragments rooted at SLCA (smallest lowest common ancestor) nodes. To guarantee that the fragments contain only interesting nodes, [1] proposes a contributor-based filtering mechanism in its MaxMatch algorithm. However, this filtering mechanism is not sufficient: it suffers from the false positive problem (discarding interesting nodes) and the redundancy problem (keeping uninteresting nodes). In this paper, we propose a framework for retrieving meaningful fragments rooted not only at the SLCA nodes, but at all LCA nodes. We begin by introducing the concept of the Relaxed Tightest Fragment (RTF) as the basic result type. We then propose a new filtering mechanism to overcome the two problems in MaxMatch. Its kernel is the concept of a valid contributor, which helps to distinguish the interesting children of a node. The new filtering mechanism prunes the nodes in an RTF that are not valid contributors to their parents. Based on the valid contributor concept, our ValidRTF algorithm not only overcomes the two problems in MaxMatch, but also satisfies the axiomatic properties deduced in [1] that an XKS technique should satisfy. We compare ValidRTF with MaxMatch on real and synthetic XML data. The results verify our claims and show the effectiveness of our valid-contributor-based filtering mechanism.
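
To make the distinction between SLCA nodes and all LCA nodes concrete, here is a minimal Python sketch. It is not the MaxMatch or ValidRTF algorithm: it is a brute-force illustration that assumes each XML node is identified by a Dewey code (a tuple giving the path from the root), and the function names `lca`, `all_lcas`, and `slcas` are hypothetical. It also does not model contributors or valid contributors, which the abstract does not define.

```python
from itertools import product


def lca(nodes):
    """Lowest common ancestor of Dewey codes = their longest common prefix."""
    prefix = []
    for parts in zip(*nodes):
        if all(p == parts[0] for p in parts):
            prefix.append(parts[0])
        else:
            break
    return tuple(prefix)


def is_ancestor(a, b):
    """True if node a is a proper ancestor of node b."""
    return len(a) < len(b) and b[:len(a)] == a


def all_lcas(match_lists):
    """LCA of every combination that takes one matching node per keyword."""
    return {lca(combo) for combo in product(*match_lists)}


def slcas(match_lists):
    """LCA nodes that have no other LCA node among their descendants."""
    lcas = all_lcas(match_lists)
    return {n for n in lcas if not any(is_ancestor(n, m) for m in lcas)}


# Toy data (assumed): the first keyword matches nodes 0.0.0 and 0.1,
# the second keyword matches node 0.0.1.
matches = [[(0, 0, 0), (0, 1)], [(0, 0, 1)]]
lca_nodes = all_lcas(matches)
slca_nodes = slcas(matches)
```

On this toy input, node `(0, 0)` is the only SLCA, while the root `(0,)` is an LCA that an SLCA-only approach would not use as a fragment root. Enumerating all combinations is exponential in the number of keywords and is used here only for clarity.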
Document type:
Conference paper
EDBT 2009, 2009, Saint Petersburg, Russia. 2009

Cited literature [29 references]

https://hal.inria.fr/inria-00433097
Contributor: Aurélien Lemay
Submitted on: Wednesday, 18 November 2009 - 10:48:23
Last modified on: Thursday, 11 January 2018 - 06:22:13
Document(s) archived on: Thursday, 17 June 2010 - 20:48:03

File

p0815-KONG.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00433097, version 1

Citation

Kong Lingbo, Rémi Gilleron, Aurélien Lemay. Retrieving Meaningful Relaxed Tightest Fragments for XML Keyword Search. EDBT 2009, 2009, Saint Petersburg, Russia. 2009. 〈inria-00433097〉

Metrics

Record views

295

File downloads

177