
Retrieving Meaningful Relaxed Tightest Fragments for XML Keyword Search

Abstract : Adapting keyword search to XML data, known as XML keyword search (XKS), has attracted considerable attention recently. One of its key tasks is to return meaningful fragments as the result. [1] is the latest work following this trend, and it focuses on returning the fragments rooted at SLCA nodes (Smallest LCA, where LCA stands for Lowest Common Ancestor). To guarantee that the fragments contain only interesting nodes, [1] proposes a contributor-based filtering mechanism in its MaxMatch algorithm. However, this filtering mechanism is not sufficient: it suffers from the false positive problem (discarding interesting nodes) and the redundancy problem (keeping uninteresting nodes). In this paper, we propose a framework for retrieving meaningful fragments rooted not only at the SLCA nodes but at all LCA nodes. We begin by introducing the concept of Relaxed Tightest Fragment (RTF) as the basic result type. We then propose a new filtering mechanism to overcome those two problems in MaxMatch. Its kernel is the concept of valid contributor, which helps to distinguish the interesting children of a node. The new filtering mechanism prunes the nodes in an RTF that are not valid contributors to their parents. Based on the valid contributor concept, our ValidRTF algorithm not only overcomes those two problems in MaxMatch, but also satisfies the axiomatic properties deduced in [1] that an XKS technique should satisfy. We compare ValidRTF with MaxMatch on real and synthetic XML data. The results verify our claims and show the effectiveness of our valid-contributor-based filtering mechanism.
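To make the distinction between LCA and SLCA nodes concrete, the sketch below computes both for a toy query. It is only an illustration under stated assumptions, not the MaxMatch or ValidRTF algorithm: nodes are assumed to be identified by Dewey labels (tuples of child positions from the root), the function names are hypothetical, and the enumeration is brute force over all keyword-match combinations. The valid contributor filtering itself is not shown, since the abstract does not define it.

```python
from itertools import product


def common_prefix(labels):
    """Longest common prefix of Dewey labels, i.e. the label of their LCA."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def lca_nodes(match_lists):
    """All LCA nodes: one LCA per combination of one match per keyword."""
    return {common_prefix(combo) for combo in product(*match_lists)}


def is_proper_ancestor(a, b):
    """True if the node labelled a is a strict ancestor of the node labelled b."""
    return len(a) < len(b) and b[:len(a)] == a


def slca_nodes(match_lists):
    """SLCA nodes: LCA nodes that have no other LCA node below them."""
    lcas = lca_nodes(match_lists)
    return {v for v in lcas
            if not any(is_proper_ancestor(v, w) for w in lcas)}


# Hypothetical toy data: one list of matching nodes per query keyword.
matches = [
    [(0, 0, 0), (0, 1, 0)],  # nodes matching the first keyword
    [(0, 0, 1)],             # nodes matching the second keyword
]

print(lca_nodes(matches))   # the root (0,) and the node (0, 0)
print(slca_nodes(matches))  # only the node (0, 0)
```

In this toy example the root is an LCA node but not an SLCA node, so an SLCA-only approach would return a single fragment rooted at (0, 0), whereas a framework covering all LCA nodes would also consider the fragment rooted at the root.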
Document type :
Conference papers

Cited literature [29 references]
Contributor : Aurélien Lemay
Submitted on : Wednesday, November 18, 2009 - 10:48:23 AM
Last modification on : Tuesday, April 28, 2020 - 11:52:03 AM
Long-term archiving on : Thursday, June 17, 2010 - 8:48:03 PM


Files produced by the author(s)


  • HAL Id : inria-00433097, version 1



Kong Lingbo, Rémi Gilleron, Aurélien Lemay. Retrieving Meaningful Relaxed Tightest Fragments for XML Keyword Search. EDBT 2009, 2009, Saint Petersburg, Russia. ⟨inria-00433097⟩


