
Evaluation of XPath Queries on XML Streams with Networks of Early Nested Word Automata

Tom Sebastian 1, 2, 3 
2 LINKS - Linking Dynamic Data
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : This thesis addresses the problem of answering XPath queries on XML streams with low latency, full coverage, high time efficiency, and low memory cost. First, we propose to approximate earliest query answering for navigational XPath queries by compiling them to early nested word automata. This yields almost optimal latency and memory consumption. Second, we contribute a formal semantics of XPath 3.0, obtained by mapping XPath to λXP, a new query language that we introduce. We then show how to compile λXP queries to networks of early nested word automata and develop streaming algorithms for these networks, which gives a streaming algorithm that covers all of XPath 3.0. Third, we develop an algorithm for projecting XML streams with respect to the query defined by an early nested word automaton, which makes our streaming algorithms highly time efficient. We have implemented all our algorithms with the aim of obtaining an industrially applicable streaming tool. Our algorithms outperform all previous approaches in time efficiency, coverage, and latency.
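As a rough illustration of what low-latency streaming evaluation means, the following minimal sketch reports matches of a simple child-axis path on the opening event of each matching element, keeping only a stack whose depth equals the current nesting depth. It is not the thesis's algorithm and does not construct an early nested word automaton. The function name stream_matches, the restriction to plain child steps such as /a/b, and the choice of reporting the ordinal of the opening tag are all assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET


def stream_matches(xml_source, path):
    """Yield the 1-based ordinal of each opening tag that matches `path`.

    `path` is a hypothetical, very restricted query form: child steps
    only, e.g. "/a/b". A match is reported at the opening event, which
    is the earliest possible point for this query class.
    """
    steps = path.strip("/").split("/")
    stack = []      # one flag per open element: does its root path match a prefix of steps?
    position = 0    # number of opening tags seen so far
    for event, elem in ET.iterparse(xml_source, events=("start", "end")):
        if event == "start":
            position += 1
            depth = len(stack)
            parent_ok = stack[-1] if stack else True
            ok = parent_ok and depth < len(steps) and elem.tag == steps[depth]
            stack.append(ok)            # push on opening tag
            if ok and depth == len(steps) - 1:
                yield position          # answer is known before the element closes
        else:
            stack.pop()                 # pop on closing tag
            elem.clear()                # drop the element's text and children


doc = io.BytesIO(b"<a><b/><c><b/></c><b/></a>")
print(list(stream_matches(doc, "/a/b")))
```

Queries with filters, such as one that selects an element only if it has a certain descendant, cannot always be answered at the opening tag; deciding the earliest point at which such answers become certain is the harder problem the abstract refers to.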

Cited literature : 60 references
Contributor : Tom Sebastian
Submitted on : Wednesday, July 6, 2016 - 11:30:20 AM
Last modification on : Thursday, March 31, 2022 - 4:37:50 AM
Long-term archiving on : Friday, October 7, 2016 - 11:02:30 AM


  • HAL Id : tel-01342511, version 1


Tom Sebastian. Evaluation of XPath Queries on XML Streams with Networks of Early Nested Word Automata. Databases [cs.DB]. Université Lille 1, 2016. English. ⟨tel-01342511⟩


