A Data Mining Approach to Discover Genetic and Environmental Factors involved in Multifactorial Diseases

L. Jourdan 1, 2, *; C. Dhaenens 1; E.G. Talbi 1, *; S. Gallina 3
* Corresponding author
1 DOLPHIN - Parallel Cooperative Multi-criteria Optimization
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract : In this paper, we are interested in discovering genetic factors involved in multifactorial diseases. To this end, experiments have been carried out by the Biological Institute of Lille, generating a large amount of data. Exploiting this data requires data mining tools, and we propose a two-phase optimization approach based on a specific genetic algorithm. In the first phase, the genetic algorithm selects the significant features. In the second phase, affected individuals are clustered according to the features selected in the first phase. The paper describes the specificities of the genetic problem under study and presents in detail the genetic algorithm developed to deal with this very large feature selection problem. Results on both artificial and real data are presented.
Document type :
Journal articles

https://hal.inria.fr/inria-00001181
Contributor : Laetitia Jourdan
Submitted on : Thursday, March 30, 2006 - 1:12:27 PM
Last modification on : Monday, September 9, 2019 - 11:00:49 AM
Long-term archiving on : Wednesday, September 8, 2010 - 4:20:17 PM

Citation

L. Jourdan, C. Dhaenens, E.G. Talbi, S. Gallina. A Data Mining Approach to Discover Genetic and Environmental Factors involved in Multifactorial Diseases. Knowledge-Based Systems, Elsevier, 2002, 15 (4), pp. 235-242. ⟨10.1016/S0950-7051(01)00145-9⟩. ⟨inria-00001181⟩
