28606 articles – 22093 references  [version française]

hal-00643097, version 1

Learning Twig and Path Queries

Slawomir Staworko () 12, Piotr Wieczorek a3

International Conference on Database Theory (ICDT) (2012)

  • a –  University of Wroclaw
  • 1:  MOSTRARE (INRIA Lille - Nord Europe)

  • INRIA – CNRS : UMR8022 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales France
  • 2:  Laboratoire d'Informatique Fondamentale de Lille (LIFL)
  • http://www.lifl.fr/
    CNRS : UMR8022 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – INRIA Bâtiment M3 59655 Villeneuve d'Ascq Cédex France
  • 3:  Institute of Computer Science

  • University of Wrocław France

Bibliographic reference

  • Type of document: Peer-reviewed conferences/proceedings
  • Domain: Computer Science/Databases
  • Title: Learning Twig and Path Queries
  • Abstract: We investigate the problem of learning XML queries, \emph{path} queries and \emph{twig} queries, from examples given by the user. A learning algorithm takes on the input a set of XML documents with nodes annotated by the user and returns a query that selects the nodes in a manner consistent with the annotation. We study two learning settings that differ with the types of annotations. In the first setting the user may only indicate \emph{required nodes} that the query must return. In the second, more general, setting, the user may also indicate \emph{forbidden nodes} that the query must not return. The query may or may not return any node with no annotation. We formalize what it means for a class of queries to be \emph{learnable}. One requirement is the existence of a learning algorithm that is \emph{sound} i.e., always returns a query consistent with the examples given by the user. Furthermore, the learning algorithm should be \emph{complete} i.e., able to produce every query with sufficiently rich examples. Other requirements involve tractability of the learning algorithm and its robustness to nonessential examples. We identify practical classes of Boolean and unary, path and twig queries that are learnable from positive examples. We also show that adding negative examples to the picture renders learning unfeasible.
  • Full text language: English
  • Publication date: 2012-03
  • Audience: international
  • Conference title: International Conference on Database Theory (ICDT)
  • Conference city: Berlin
  • Country: Germany
  • Conference date: 2012-03

Attached file list to this document: 

PDF
staworko-icdt12a.pdf(467 KB)
 
  • hal-00643097, version 1
  • oai:hal.inria.fr:hal-00643097
  • From: 
  • Submitted on: Thursday, 5 April 2012 22:48:56
  • Updated on: Friday, 6 April 2012 08:34:12