Fingerprinting Big Data: The Case of KNN Graph Construction - Archive ouverte HAL Access content directly
Reports (Research Report) Year : 2018

Fingerprinting Big Data: The Case of KNN Graph Construction

Empreintes numériques et Big Data: le cas de la construction des graphes KNN

(1) , (1, 2) , (3) , (3)


We propose fingerprinting, a new technique that consists in constructing compact, fast-to-compute and privacy-preserving binary representations of datasets. We illustrate the effectiveness of our approach on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that fingerprinting can drastically accelerate a large range of existing KNN algorithms, while efficiently obfuscating the original data, with little to no overhead. Our extensive evaluation of the resulting approach (dubbed GoldFinger) on several realistic datasets shows that our approach delivers speedups of up to 78.9% compared to the use of raw data while only incurring a negligible to moderate loss in terms of KNN quality. To convey the practical value of such a scheme, we apply it to item recommendation, and show that the loss in recommendation quality is negligible.
Fichier principal
Vignette du fichier
RR-9218.pdf (2.18 Mo) Télécharger le fichier
Origin : Files produced by the author(s)

Dates and versions

hal-01904341 , version 1 (24-10-2018)


  • HAL Id : hal-01904341 , version 1


Rachid Guerraoui, Anne-Marie Kermarrec, Olivier Ruas, François Taïani. Fingerprinting Big Data: The Case of KNN Graph Construction. [Research Report] RR-9218, INRIA Rennes - Bretagne Atlantique; INRIA - IRISA - PANAMA; Université de Rennes 1; EPFL; Mediego. 2018, pp.1-30. ⟨hal-01904341⟩
139 View
284 Download


Gmail Facebook Twitter LinkedIn More