Conference papers

Data Corpora for Digital Forensics Education and Research

Abstract : Data corpora are very important for digital forensics education and research. Several corpora are available to academia; these range from small manually-created data sets of a few megabytes to many terabytes of real-world data. However, different corpora are suited to different forensic tasks. For example, real data corpora are often desirable for testing forensic tool properties such as effectiveness and efficiency, but these corpora typically lack the ground truth that is vital to performing proper evaluations. Synthetic data corpora can support tool development and testing, but only if the methodologies for generating the corpora guarantee data with realistic properties. This paper presents an overview of the available digital forensic corpora and discusses the problems that may arise when working with specific corpora. The paper also describes a framework for generating synthetic corpora for education and research when suitable real-world data is not available.
Document type : Conference papers

Cited literature : 23 references
Contributor : Hal Ifip
Submitted on : Tuesday, November 8, 2016 - 10:52:17 AM
Last modification on : Tuesday, October 19, 2021 - 12:49:45 PM
Long-term archiving on : Tuesday, March 14, 2017 - 11:58:02 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



York Yannikos, Lukas Graner, Martin Steinebach, Christian Winter. Data Corpora for Digital Forensics Education and Research. 10th IFIP International Conference on Digital Forensics (DF), Jan 2014, Vienna, Austria. pp.309-325, ⟨10.1007/978-3-662-44952-3_21⟩. ⟨hal-01393787⟩


