
Towards a Text Mining Methodology Using Frequent Itemsets and Association Rule Extraction

Hacène Cherfi 1 Amedeo Napoli 1 Yannick Toussaint 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This paper proposes a text mining methodology based on the classical knowledge discovery loop, with a number of adaptations. First, texts are indexed and prepared for a levelwise search for frequent itemsets. Association rules are then extracted and interpreted with respect to a set of quality measures and domain knowledge, under the control of an analyst. The article includes an experiment on a real-world corpus of molecular biology texts.
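The two mining steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each indexed text is reduced to a set of terms, uses a generic Apriori-style levelwise search, and keeps only support and confidence as quality measures (the abstract does not list the measures actually used). The corpus, the thresholds, and the function names `frequent_itemsets` and `association_rules` are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Levelwise search; returns {itemset: absolute support count}."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single term seen in the corpus.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many texts contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules (lhs, rhs, support, confidence) from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is available in the dictionary.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical indexed texts: each text is the set of terms it contains.
texts = [
    {"gene", "mutation", "resistance"},
    {"gene", "mutation"},
    {"gene", "resistance"},
    {"gene", "mutation", "resistance"},
    {"protein"},
]
freq = frequent_itemsets(texts, min_support=2)
rules = association_rules(freq, min_confidence=1.0)
for lhs, rhs, support, confidence in rules:
    print(sorted(lhs), "->", sorted(rhs), support, confidence)
```

In the methodology described above, the resulting rules would then be ranked and interpreted by an analyst using further quality measures and domain knowledge; that step is not modelled here.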
Document type :
Conference papers

https://hal.inria.fr/inria-00107723
Contributor : Publications Loria
Submitted on : Thursday, October 19, 2006 - 9:07:02 AM
Last modification on : Friday, February 26, 2021 - 3:28:05 PM
Long-term archiving on : Wednesday, March 29, 2017 - 12:52:44 PM

Identifiers

  • HAL Id : inria-00107723, version 1

Citation

Hacène Cherfi, Amedeo Napoli, Yannick Toussaint. Towards a Text Mining Methodology Using Frequent Itemsets and Association Rule Extraction. Journées d'informatique Messine - JIM'03, E. SanJuan, Sep 2003, Metz, France, pp. 285-294. ⟨inria-00107723⟩