Mining complex data and biclustering using formal concept analysis

Nyoman Juniarta 1
1 ORPAILLEUR - Knowledge representation, reasoning
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Knowledge discovery in databases (KDD) is a process applied to possibly large volumes of data to discover patterns that can be significant and useful. In this thesis, we are interested in data transformation and data mining in knowledge discovery applied to complex data, and we present several experiments related to different approaches and different data types. The first part of this thesis focuses on the task of biclustering using formal concept analysis (FCA) and pattern structures. FCA is naturally related to biclustering, where the objective is to simultaneously group rows and columns that satisfy some regularity. Pattern structures generalize FCA to more complex data. Partition pattern structures were proposed to discover constant-column biclusters, while interval pattern structures were studied for similar-column biclustering. Here we extend these approaches to enumerate other types of biclusters: additive, multiplicative, order-preserving, and coherent-sign-changes. The second part of this thesis focuses on two experiments in mining complex data. First, we present a contribution related to the CrossCult project, where we analyze a dataset of visitor trajectories in a museum. We apply sequence clustering and FCA-based sequential pattern mining to discover patterns in the dataset and to classify these trajectories. This analysis can be used within the CrossCult project to build recommendation systems for future visitors. Second, we present our work related to the task of antibacterial drug discovery. The dataset for this task is generally a numerical matrix with molecules as rows and features/attributes as columns. The huge number of features makes molecule classification harder for any classifier. Here we study a feature selection approach based on log-linear analysis, which discovers associations among features.
In summary, this thesis presents a series of experiments in the mining of complex real-world data.
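To make the bicluster types named in the abstract concrete, here is a minimal sketch that checks whether a chosen submatrix is a constant-column, additive, or multiplicative bicluster, using the usual textbook definitions. It is not the FCA or pattern-structure enumeration developed in the thesis. The function names, the toy matrix, and the restriction to nonzero integer entries (needed for exact ratios) are assumptions made for illustration only.

```python
from fractions import Fraction


def submatrix(matrix, row_ids, col_ids):
    """Extract the rows and columns that form a candidate bicluster."""
    return [[matrix[i][j] for j in col_ids] for i in row_ids]


def is_constant_column(rows):
    """Constant-column: every column holds a single value across all rows."""
    return all(len(set(col)) == 1 for col in zip(*rows))


def is_additive(rows):
    """Additive: each row equals the first row plus one row-specific offset."""
    first = rows[0]
    return all(len({a - b for a, b in zip(row, first)}) == 1 for row in rows)


def is_multiplicative(rows):
    """Multiplicative: each row equals the first row times one row-specific factor.

    Assumes nonzero integer entries so that ratios can be compared exactly.
    """
    first = rows[0]
    return all(
        len({Fraction(a, b) for a, b in zip(row, first)}) == 1 for row in rows
    )


# Hypothetical toy data: rows are objects, columns are attributes.
matrix = [
    [1, 2, 5, 9],
    [3, 4, 7, 9],
    [2, 4, 10, 9],
    [4, 1, 4, 8],
]

constant_part = submatrix(matrix, [1, 2], [1, 3])           # [[4, 9], [4, 9]]
additive_part = submatrix(matrix, [0, 1], [0, 1, 2])        # second row = first row + 2
multiplicative_part = submatrix(matrix, [0, 2], [0, 1, 2])  # second row = first row * 2

print(is_constant_column(constant_part))
print(is_additive(additive_part))
print(is_multiplicative(multiplicative_part))
```

Under these definitions a constant-column bicluster is also additive (offset 0) and multiplicative (factor 1), so the three checks are nested and not mutually exclusive.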
Document type : Theses
Cited literature [120 references]

https://hal.inria.fr/tel-02426034
Contributor : Nyoman Juniarta
Submitted on : Wednesday, January 1, 2020 - 10:14:38 AM
Last modification on : Saturday, March 7, 2020 - 1:15:11 AM
Long-term archiving on : Thursday, April 2, 2020 - 2:06:51 PM

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02426034, version 1

Citation

Nyoman Juniarta. Mining complex data and biclustering using formal concept analysis. Computer Science [cs]. Université de Lorraine, 2019. English. ⟨NNT : 2019LORR0199⟩. ⟨tel-02426034⟩

Metrics

Record views : 159
File downloads : 285