A Privacy-Preserving Framework for Large-Scale Content-Based Information Retrieval

Li Weng; Laurent Amsaleg; April Morton; Stéphane Marchand-Maillet

doi:10.1109/TIFS.2014.2365998

Journal article, IEEE Transactions on Information Forensics and Security. Year: 2014


Li Weng

Role: Author
PersonId: 3608
IdHAL: li-weng

Creating and exploiting explicit links between multimedia fragments

Laurent Amsaleg

Role: Author
PersonId: 15318
IdHAL: laurent-amsaleg
ORCID: 0000-0003-0204-0930
IdRef: 154720879

Multimedia content-based indexing

April Morton

Role: Author

Computer Vision and Multimedia Laboratory [Geneve]

Stéphane Marchand-Maillet

Role: Author

Computer Vision and Multimedia Laboratory [Geneve]

Abstract

We propose a privacy protection framework for large-scale content-based information retrieval. It offers two layers of protection. First, robust hash values are used as queries, so that neither the original content nor its features are revealed. Second, the client can choose to omit certain bits in a hash value to further increase the ambiguity for the server. Because the query then carries less information, it is computationally difficult for the server to infer the client’s interest, and the server has to return the hash values of all possible candidates to the client. The client performs a search within the candidate list to find the best match. Since only hash values are exchanged between the client and the server, the privacy of both parties is protected. We introduce the concept of tunable privacy, where the privacy protection level can be adjusted according to a policy. It is realized through hash-based piece-wise inverted indexing. The idea is to divide a feature vector into pieces and index each piece with a sub-hash value. Each sub-hash value is associated with an inverted index list. The framework has been extensively tested using a large image database. We have evaluated both retrieval performance and privacy-preserving performance for a particular content identification application. Two different constructions of robust hash algorithms are used. One is based on random projections; the other is based on the discrete wavelet transform. Both algorithms exhibit satisfactory performance in comparison with state-of-the-art retrieval schemes. The results show that the privacy enhancement slightly improves the retrieval performance. We consider the majority voting attack for estimating the query category and ID. Experimental results show that this attack is a threat when there are near-duplicates, but the success rate decreases with the number of omitted bits and the number of distinct items.
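The query flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sign-of-random-projection hash is a generic stand-in for the paper's robust hash, and every name here (`make_projections`, `robust_hash`, `omit_bits`, `PiecewiseIndex`, `client_search`), the piece length, and the choice to union the inverted lists across pieces are assumptions made for illustration. The sketch shows a hash being split into sub-hash pieces, each piece pointing to an inverted list, the client omitting bits, the server expanding the omitted bits into every possible completion, and the client ranking the returned candidates locally.

```python
import random
from collections import defaultdict
from itertools import product


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def robust_hash(feature, projections):
    """One bit per projection: the sign of the dot product with the feature."""
    return tuple(
        1 if sum(w * x for w, x in zip(row, feature)) >= 0 else 0
        for row in projections
    )


def split_pieces(bits, piece_len):
    """Cut a hash into consecutive sub-hash pieces of piece_len bits."""
    return [bits[i:i + piece_len] for i in range(0, len(bits), piece_len)]


def omit_bits(bits, omitted_positions):
    """Client side: replace the chosen bit positions with None (omitted)."""
    return tuple(None if i in omitted_positions else b for i, b in enumerate(bits))


def completions(piece):
    """All sub-hash values consistent with a piece that has omitted bits."""
    return product(*[(0, 1) if b is None else (b,) for b in piece])


class PiecewiseIndex:
    """Server side: one inverted list per (piece position, sub-hash value)."""

    def __init__(self, piece_len):
        self.piece_len = piece_len
        self.lists = defaultdict(set)
        self.hashes = {}

    def add(self, item_id, bits):
        self.hashes[item_id] = bits
        for pos, piece in enumerate(split_pieces(bits, self.piece_len)):
            self.lists[(pos, piece)].add(item_id)

    def candidates(self, masked_bits):
        """Return the hashes of every item matching any completed piece."""
        ids = set()
        for pos, piece in enumerate(split_pieces(masked_bits, self.piece_len)):
            for full in completions(piece):
                ids |= self.lists.get((pos, full), set())
        return {i: self.hashes[i] for i in ids}


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def client_search(full_query_bits, candidates):
    """Client side: pick the candidate closest to the full (private) hash."""
    return min(candidates,
               key=lambda i: hamming(full_query_bits, candidates[i]),
               default=None)


if __name__ == "__main__":
    proj = make_projections(dim=16, n_bits=16, seed=1)
    rng = random.Random(2)
    database = {f"img{i}": [rng.gauss(0, 1) for _ in range(16)] for i in range(100)}
    index = PiecewiseIndex(piece_len=4)
    for item_id, feat in database.items():
        index.add(item_id, robust_hash(feat, proj))
    query = robust_hash(database["img7"], proj)
    cands = index.candidates(omit_bits(query, {0, 5, 10}))
    print(len(cands), client_search(query, cands))
```

In this sketch, each omitted bit doubles the number of sub-hash values the server must look up for the affected piece, so the candidate list can only grow as more bits are omitted. This is one way to picture the tunable trade-off between ambiguity for the server and the amount of data returned to the client.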

Keywords

privacy; security; large-scale; Content-Based Image Retrieval; indexing; image hashing; multimedia database; data privacy

Domains

Cryptography and Security [cs.CR]; Multimedia [cs.MM]

Contributor: Laurent Amsaleg

https://inria.hal.science/hal-01083328

Submitted on: Monday, November 17, 2014, 10:10:29

Last modified on: Sunday, January 7, 2024, 18:56:06

Dates and versions

hal-01083328, version 1 (17-11-2014)

Identifiers

HAL Id: hal-01083328, version 1
DOI: 10.1109/TIFS.2014.2365998

Cite

Li Weng, Laurent Amsaleg, April Morton, Stéphane Marchand-Maillet. A Privacy-Preserving Framework for Large-Scale Content-Based Information Retrieval. IEEE Transactions on Information Forensics and Security, 2014, 10, pp.152-167. ⟨10.1109/TIFS.2014.2365998⟩. ⟨hal-01083328⟩


Collections

INSTITUT-TELECOM EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D6 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE ANR UR1-MATH-NUM

