Sublinear DTD Validity

Antoine Ndione 1 Aurélien Lemay 1 Joachim Niehren 1
1 LINKS - Linking Dynamic Data
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : We present an efficient algorithm for testing approximate dtd validity modulo the strong tree edit distance. Our algorithm inspects xml documents in a probabilistic manner. It detects with high probability the nonvalidity of xml documents with a large fraction of errors, measured in terms of the strong tree edit distance from the dtd. The run time depends polynomially on the depth of the xml document tree but not on its size, so that it is sublinear in most cases (because in practice xml documents tend to be shallow). Therefore, our algorithm can be used to speed up exact dtd validators that run in linear time. We also prove a negative result showing that the run time of any approximate dtd validity tester must depend on the depth of the input tree.

This is a long version of a LATA paper available here.

Type de document :
Autre publication
Long version of a Lata 15 paper. 2015
Liste complète des métadonnées
Contributeur : Inria Links <>
Soumis le : mardi 25 août 2015 - 19:31:57
Dernière modification le : mardi 3 juillet 2018 - 11:32:01


  • HAL Id : hal-01187016, version 1



Antoine Ndione, Aurélien Lemay, Joachim Niehren. Sublinear DTD Validity. Long version of a Lata 15 paper. 2015. 〈hal-01187016〉



Consultations de la notice