Abstract : Email spam has recently been recognized as a covert communication channel for criminals. However, investigators tend to overlook this kind of evidence, and searching unstructured text for incriminating information is especially cumbersome given the characteristics of email spam. This paper is the first to present a unified text mining solution for detecting digital evidence in spam emails. It is helpful in the initial stage of an investigation, when investigators often have little information about the collection of spam emails. The proposed solution applies a topic modeling technique, Latent Dirichlet Allocation, together with a text visualization technique to discover suspicious emails that use different camouflage methods. We present experimental results on a data set of 100 random spam emails collected by the Spam Archive. The results suggest that the proposed method is able to identify potential evidence.
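To make the topic modeling step concrete, the following is a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling. It is not the authors' implementation: the abstract does not state which inference method or parameters were used, so the sampler, the hyperparameters (`alpha`, `beta`, `num_topics`, `iters`), the function name `fit_lda`, and the four toy emails are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (doc_topic, topic_word, vocab): per-document topic counts,
    per-topic word counts, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab


# Hypothetical toy corpus, invented for illustration only.
emails = [
    "cheap pills discount pharmacy order pills".split(),
    "discount pharmacy cheap order today".split(),
    "meeting package airport friday deliver package".split(),
    "deliver package friday meeting location".split(),
]

doc_topic, topic_word, vocab = fit_lda(emails)
for k, counts in enumerate(topic_word):
    top_words = [w for w, c in counts.most_common(3) if c > 0]
    print("topic", k, top_words)
```

In an investigation setting, an email whose topic mixture differs sharply from the rest of the collection would be a candidate for closer manual review. How the paper scores or visualizes such emails is not described in the abstract.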
https://hal.inria.fr/hal-01614994
Contributor : Hal Ifip
Submitted on : Wednesday, October 11, 2017 - 4:57:59 PM
Last modification on : Wednesday, March 28, 2018 - 1:26:01 PM
Long-term archiving on: Friday, January 12, 2018 - 3:22:26 PM
Bo Yang, Jianguo Jiang, Ning Li. Towards Discovering Covert Communication Through Email Spam. 9th International Conference on Intelligent Information Processing (IIP), Nov 2016, Melbourne, VIC, Australia. pp.191-201, ⟨10.1007/978-3-319-48390-0_20⟩. ⟨hal-01614994⟩