
Definable relations and first-order query languages over strings

Michael Benedikt (1), Leonid Libkin (2), Thomas Schwentick (3), Luc Segoufin (4)
(4) GEMO - Integration of data and knowledge distributed over the web; LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: We study analogs of classical relational calculus in the context of strings. We start by studying string logics. Taking a classical model-theoretic approach, we fix a set of string operations and look at the resulting collection of definable relations. These form an algebra: a class of n-ary relations for every n, closed under projection and Boolean operations. We show that by choosing the string vocabulary carefully, we get string logics that have desirable properties: computable evaluation and normal forms. We identify five distinct models and study the differences in their model theory and complexity of evaluation. We identify a subset of these models that have additional attractive properties, such as finite VC dimension and quantifier elimination. Once you have a logic, the addition of free predicate symbols gives you a string query language. The resulting languages have attractive closure properties from a database point of view: while SQL does not allow the full composition of string pattern-matching expressions with relational operators, these logics yield compositional query languages that can capture common string-matching queries while remaining tractable. For each of the logics studied in the first part of the paper, we study properties of the corresponding query languages. We give bounds on the data complexity of queries, extend the normal form results from logics to queries, and show that the languages have corresponding algebras expressing safe queries.
Document type: Journal articles

Cited literature: 79 references
Contributor: Luc Segoufin
Submitted on: Friday, June 5, 2020 - 1:50:00 PM
Last modification on: Sunday, June 26, 2022 - 12:15:59 PM






Michael Benedikt, Leonid Libkin, Thomas Schwentick, Luc Segoufin. Definable relations and first-order query languages over strings. Journal of the ACM (JACM), Association for Computing Machinery, 2003, 50 (5), pp.694-751. ⟨10.1145/876638.876642⟩. ⟨hal-02796572⟩




