T. A. Berson, Differential Cryptanalysis Mod 2^32 with Applications to MD5, Eurocrypt, pp.71-80, 1992.
DOI : 10.1007/3-540-47555-9_6

Z. Wang, Clean-living: Eliminating Near-Duplicates in Lifetime Personal Storage, 2005.

J. P. Kumar, Duplicate and Near Duplicate Documents Detection: A Review, European Journal of Scientific Research, 2009.

U. Manber, Finding Similar Files in a Large File System, USENIX Winter Technical Conference, 1994.

A. Z. Broder, Some applications of Rabin's fingerprinting method, Sequences II: Methods in Communications, Security, and Computer Science, 1993.

A. Chowdhury, Collection statistics for fast duplicate document detection, ACM Transactions on Information Systems, vol.20, issue.2, pp.171-191, 2002.
DOI : 10.1145/506309.506311

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.5.3673

A. Z. Broder, Identifying and Filtering Near-Duplicate Documents, Proceedings of COM '00
DOI : 10.1007/3-540-45123-4_1

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.365.5357

L. Gravano, Approximate string joins in a database (almost) for free, 2001.

Ilinsky, An efficient method to detect duplicates of Web documents with the use of inverted index

Ferro, An Efficient Duplicate Record Detection Using q-Grams Array Inverted Index
DOI : 10.1007/978-3-642-15105-7_25

P. Indyk, Approximate nearest neighbors, Proceedings of the thirtieth annual ACM symposium on Theory of computing, STOC '98
DOI : 10.1145/276698.276876

A. Kolcz, Improved robustness of signature-based near replica detection via lexicon randomization, KDD, 2004.