Node Similarity with q -Grams for Real-World Labeled Networks

Abstract : We study node similarity in labeled networks, using the label sequences found in paths of bounded length q leading to the nodes. (This recalls the q-grams employed in document resemblance, based on the Jaccard distance.) When applied to networks, the challenge is twofold: the number of q-grams generated from labeled paths grows exponentially with q, and their frequency should be taken into account: this leads to a variation of the Jaccard index known as Bray-Curtis index for multisets. We describe nSimGram, a suite of fast algorithms for node similarity with q-grams, based on a novel blend of color coding, probabilistic counting, sketches, and string algorithms, where the universe of elements to sample is exponential. We provide experimental evidence that our measure is effective and our running times scale to deal with large real-world networks.
Type de document :
Communication dans un congrès
KDD 2018 - ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2018, London, United Kingdom. pp.1282-1291, Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 〈10.1145/3219819.3220085〉
Liste complète des métadonnées

Littérature citée [4 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/hal-01964716
Contributeur : Marie-France Sagot <>
Soumis le : jeudi 3 janvier 2019 - 12:10:22
Dernière modification le : mercredi 23 janvier 2019 - 08:26:37

Identifiants

Collections

Citation

Alessio Conte, Gaspare Ferraro, Roberto Grossi, Andrea Marino, Kunihiko Sadakane, et al.. Node Similarity with q -Grams for Real-World Labeled Networks. KDD 2018 - ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2018, London, United Kingdom. pp.1282-1291, Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 〈10.1145/3219819.3220085〉. 〈hal-01964716〉

Partager

Métriques

Consultations de la notice

13