Evaluating Tree Pattern Similarity for Content-based Routing Systems

Raphaël Chand 1 Pascal Felber 1
1 MASCOTTE - Algorithms, simulation, combinatorics and optimization for telecommunications
CRISAM - Inria Sophia Antipolis - Méditerranée, Laboratoire I3S - COMRED - COMmunications, Réseaux, systèmes Embarqués et Distribués
Abstract : With the advent of XML as the de facto language for data interchange, scalable distribution of data to large populations of consumers remains an important challenge. Content-based publish/subscribe systems offer a convenient abstraction for data producers and consumers, as most of the complexity related to addressing and routing is encapsulated within the network infrastructure. Data consumers typically specify their subscriptions using some XML pattern specification language (e.g., XPath), while producers publish content without prior knowledge of the recipients, if any. A novel approach to content-based routing consists of organizing consumers with similar interests into peer-to-peer semantic communities, inside which XML documents are propagated. To build semantic communities and connect peers that share common interests, one needs to evaluate the similarity between their subscriptions. In this paper, we address this problem specifically and propose novel algorithms to compute the similarity of seemingly unrelated tree patterns by taking advantage of information derived from the XML document types, such as valid combinations of elements, or conjunctions and disjunctions on their occurrence. These results are of interest in their own right and can prove useful in other domains, such as approximate XML queries involving tree patterns. Results from a prototype implementation validate the effectiveness of our approach.
Document type :
Reports

Cited literature [21 references]

https://hal.inria.fr/inria-00071377
Contributor : Rapport de Recherche Inria
Submitted on : Tuesday, May 23, 2006 - 4:57:43 PM
Last modification on : Wednesday, October 14, 2020 - 4:24:17 AM
Long-term archiving on : Monday, September 17, 2012 - 4:01:41 PM

Identifiers

  • HAL Id : inria-00071377, version 1

Citation

Raphaël Chand, Pascal Felber. Evaluating Tree Pattern Similarity for Content-based Routing Systems. [Research Report] RR-5891, INRIA. 2006. ⟨inria-00071377⟩