Journal articles

C3Ro: An efficient mining algorithm of extended-closed contiguous robust sequential patterns in noisy data

Y Abboud (1), Armelle Brun (1), Anne Boyer (1)
1 KIWI - Knowledge Information and Web Intelligence
LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract: Sequential pattern mining has been the focus of many works, but mining large databases remains challenging, both in efficiency and in the apprehensibility of the resulting pattern set. To overcome these issues, the most promising direction in the literature relies on constraints, including the well-known closedness constraint. However, such mining is not resistant to noise, a characteristic of most real-world data. The main research question raised in this paper is thus: how can an apprehensible set of sequential patterns be mined efficiently from noisy data? To address this question, we introduce 1) two original constraints designed for mining noisy data, the robustness and the extended-closedness constraints, and 2) a generic pattern mining algorithm, C3Ro, designed to mine a wide range of sequential patterns, from closed or maximal contiguous sequential patterns to closed or maximal regular sequential patterns. C3Ro is intended for practitioners and can handle their multiple constraints. C3Ro is also the first sequential pattern mining algorithm to be this generic and parameterizable. Extensive experiments reveal the high efficiency of C3Ro, especially on large datasets, compared with well-known algorithms from the literature. Additional experiments were conducted on a noisy real-world dataset of job offers, with the goal of mining activities. These experiments offer a more thorough insight into the C3Ro algorithm: job market experts confirm that the constraints we introduced have a significant positive impact on the apprehensibility of the set of mined activities.
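As an illustration of the closedness constraint mentioned in the abstract, the following minimal Python sketch mines closed contiguous sequential patterns by brute force. It is not the C3Ro algorithm, and it does not attempt the robustness or extended-closedness constraints, which the abstract names but does not define. It assumes the standard definitions: the support of a pattern is the number of sequences that contain it as a contiguous run, and a frequent pattern is closed if no longer frequent pattern containing it has the same support. All function names are hypothetical.

```python
from collections import defaultdict


def contiguous_support(sequences):
    """Count, for each contiguous subsequence, how many sequences contain it."""
    support = defaultdict(int)
    for seq in sequences:
        seen = set()  # count each pattern at most once per sequence
        n = len(seq)
        for start in range(n):
            for end in range(start + 1, n + 1):
                seen.add(tuple(seq[start:end]))
        for pattern in seen:
            support[pattern] += 1
    return dict(support)


def contains_contiguously(big, small):
    """Return True if `small` occurs in `big` as a contiguous run."""
    k = len(small)
    return any(big[i:i + k] == small for i in range(len(big) - k + 1))


def closed_contiguous_patterns(sequences, min_support):
    """Brute-force sketch (assumed standard definitions, not C3Ro):
    keep frequent contiguous patterns that have no longer frequent
    super-pattern with the same support."""
    support = contiguous_support(sequences)
    frequent = {p: s for p, s in support.items() if s >= min_support}
    closed = {}
    for pattern, count in frequent.items():
        absorbed = any(
            len(other) > len(pattern)
            and frequent[other] == count
            and contains_contiguously(other, pattern)
            for other in frequent
        )
        if not absorbed:
            closed[pattern] = count
    return closed
```

For example, on the three sequences `a b c`, `a b c d` and `b c d` with a minimum support of 2, the pattern `a b` is discarded because `a b c` occurs in exactly the same number of sequences, whereas `b c` is kept because every longer pattern containing it has a lower support. The sketch enumerates every contiguous subsequence, so it is only practical for small inputs.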
Document type: Journal articles

Contributor: Armelle Brun
Submitted on: Sunday, October 25, 2020 - 6:15:04 AM
Last modification on: Wednesday, November 3, 2021 - 7:57:40 AM
Long-term archiving on: Tuesday, January 26, 2021 - 6:02:35 PM

Y Abboud, Armelle Brun, Anne Boyer. C3Ro: An efficient mining algorithm of extended-closed contiguous robust sequential patterns in noisy data. Expert Systems with Applications, Elsevier, 2019, 131, pp. 172-189. ⟨10.1016/j.eswa.2019.04.058⟩. ⟨hal-02977461⟩