HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation
Habilitation à diriger des recherches

Symbolic Inference Methods for Databases

Slawomir Staworko 1, 2
1 LINKS - Linking Dynamic Data
CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189, Inria Lille - Nord Europe
Abstract : This dissertation is a summary of a line of research, that I was actively involved in, on learning in databases from examples. This research focused on traditional as well as novel database models and languages for querying, transforming, and describing the schema of a database. In case of schemas our contributions involve proposing an original languages for the emerging data models of Unordered XML and RDF. We have studied learning from examples of schemas for Unordered XML, schemas for RDF, twig queries for XML, join queries for relational databases, and XML transformations defined with a novel model of tree-to-word transducers. Investigating learnability of the proposed languages required us to examine closely a number of their fundamental properties, often of independent interest, including normal forms, minimization, containment and equivalence, consistency of a set of examples, and finite characterizability. Good understanding of these properties allowed us to devise learning algorithms that explore a possibly large search space with the help of a diligently designed set of generalization operations in search of an appropriate solution. Learning (or inference) is a problem that has two parameters: the precise class of languages we wish to infer and the type of input that the user can provide. We focused on the setting where the user input consists of positive examples i.e., elements that belong to the goal language, and negative examples i.e., elements that do not belong to the goal language. In general using both negative and positive examples allows to learn richer classes of goal languages than using positive examples alone. However, using negative examples is often difficult because together with positive examples they may cause the search space to take a very complex shape and its exploration may turn out to be computationally challenging.
Document type :
Habilitation à diriger des recherches
Complete list of metadata

Cited literature [103 references]  Display  Hide  Download

Contributor : Slawomir Staworko Connect in order to contact the contributor
Submitted on : Wednesday, January 13, 2016 - 1:13:11 AM
Last modification on : Wednesday, March 23, 2022 - 3:51:21 PM
Long-term archiving on: : Friday, November 11, 2016 - 3:34:51 AM


  • HAL Id : tel-01254965, version 1


Slawomir Staworko. Symbolic Inference Methods for Databases. Databases [cs.DB]. Université de Lille, 2015. ⟨tel-01254965⟩



Record views


Files downloads