Towards Bounding Sequential Patterns

Chedy Raïssi 1 Jian Pei 2
1 ORPAILLEUR - Knowledge representation, reasonning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Given a sequence database, can we have a non-trivial upper bound on the number of sequential patterns? The problem of bounding sequential patterns is very challenging in theory due to the combinatorial complexity of sequences, even given some inspiring results on bounding itemsets in frequent itemset mining. Moreover, the problem is highly meaningful in practice, since the upper bound can be used in many applications such as space allocation in building sequence data warehouses. In this paper, we tackle the problem of bounding sequential patterns by presenting, for the first time in the field of sequential pattern mining, strong combinatorial results on computing the number of possible sequential patterns that can be generated at a given length k. We introduce, as a case study, two novel techniques to estimate the number of candidate sequences. An extensive empirical study on both real data and synthetic data verifies the effectiveness of our methods.
Document type :
Conference papers
Complete list of metadatas

Cited literature [25 references]  Display  Hide  Download

https://hal.inria.fr/inria-00623550
Contributor : Chedy Raïssi <>
Submitted on : Wednesday, September 14, 2011 - 3:47:30 PM
Last modification on : Thursday, January 11, 2018 - 6:19:54 AM
Long-term archiving on : Thursday, March 30, 2017 - 2:58:32 PM

File

p1379.pdf
Files produced by the author(s)

Identifiers

Collections

Citation

Chedy Raïssi, Jian Pei. Towards Bounding Sequential Patterns. 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD-2011, Aug 2011, San Diego, United States. ⟨10.1145/2020408.2020612⟩. ⟨inria-00623550⟩

Share

Metrics

Record views

357

Files downloads

270