Abstract: The relentless increase in storage capacity and decrease in storage cost present an escalating challenge for digital forensic investigations: current forensic technologies are not designed to scale to the degree necessary to process the ever-increasing volumes of digital evidence. This paper describes a similarity-digest-based approach that scales up the task of finding related digital artifacts in massive data sets. The results show that digests can be generated at rates exceeding those of cryptographic hashes on commodity multi-core computing systems. In addition, querying the digest of a large (1 TB) target for the (trace) presence of a small file can be completed in less than one second with very high precision and recall rates.
https://hal.inria.fr/hal-01523709
Contributor: Hal Ifip
Submitted on: Tuesday, May 16, 2017 - 5:10:18 PM
Last modification on: Thursday, March 5, 2020 - 4:46:41 PM
Long-term archiving on: Friday, August 18, 2017 - 12:10:58 AM
Vassil Roussev. Managing Terabyte-Scale Investigations with Similarity Digests. 8th International Conference on Digital Forensics (DF), Jan 2012, Pretoria, South Africa. pp.19-34, ⟨10.1007/978-3-642-33962-2_2⟩. ⟨hal-01523709⟩