Journal articles

C3Ro: An efficient mining algorithm of extended-closed contiguous robust sequential patterns in noisy data

Y Abboud 1 Armelle Brun 1 Anne Boyer 1
1 KIWI - Knowledge Information and Web Intelligence
LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract : Sequential pattern mining has been the focus of many works, but still faces a tough challenge when mining large databases, both in efficiency and in the apprehensibility of the resulting pattern set. To overcome these issues, the most promising direction taken by the literature relies on the use of constraints, including the well-known closedness constraint. However, such mining is not resistant to noise, a characteristic of most real-world data. The main research question raised in this paper is thus: how to efficiently mine an apprehensible set of sequential patterns from noisy data? To address this question, we introduce 1) two original constraints designed for the mining of noisy data: the robustness and the extended-closedness constraints, and 2) a generic pattern mining algorithm, C3Ro, designed to mine a wide range of sequential patterns, ranging from closed or maximal contiguous sequential patterns to closed or maximal regular sequential patterns. C3Ro is dedicated to practitioners and is able to manage their multiple constraints. C3Ro is also the first sequential pattern mining algorithm to be this generic and parameterizable. Extensive experiments have been conducted and reveal the high efficiency of C3Ro, especially on large datasets, over well-known algorithms from the literature. Additional experiments have been conducted on a noisy real-world dataset of job offers, with the goal of mining activities. These experiments offer a more thorough insight into the C3Ro algorithm: job market experts confirm that the constraints we introduced have a significant positive impact on the apprehensibility of the set of mined activities.
Document type :
Journal articles

Cited literature [92 references]

https://hal.inria.fr/hal-02977461
Contributor : Armelle Brun
Submitted on : Sunday, October 25, 2020 - 6:15:04 AM
Last modification on : Wednesday, October 28, 2020 - 3:36:10 AM

File

Article_VersionAuteur.pdf
Files produced by the author(s)

Citation

Y Abboud, Armelle Brun, Anne Boyer. C3Ro: An efficient mining algorithm of extended-closed contiguous robust sequential patterns in noisy data. Expert Systems with Applications, Elsevier, 2019, 131, pp.172-189. ⟨10.1016/j.eswa.2019.04.058⟩. ⟨hal-02977461⟩

Metrics

Record views : 7
File downloads : 84