A new approach to detect Interactions Involving Lipid Genes by Combining Data Mining and Statistics in the STANISLAS Cohort

Abstract: Knowledge Discovery in Databases (KDD) is the analysis of large sets of observational data to find unsuspected relationships and to summarize the data in novel ways that may be both understandable and useful. Data mining is the central step of the KDD process: algorithms are run to extract these relationships and summaries, which are referred to as models or patterns [1]. We aimed to identify new interactions in the domain of lipid genetics by using an approach combining data mining and statistics. The population studied consisted of 772 men and 780 women from the STANISLAS cohort [2]. The data mining methods used in our experiments were based on the Close algorithm for extracting closed frequent patterns and association rules [3]. After preliminary work on the whole set of genetic, biological, and clinical data, we focused on subsamples related to the APOB and APOE genes. The corresponding rules suggested hypotheses that were then validated statistically. In men, a significant interaction was found between an environmental factor and the APOBThr71Ile polymorphism on LDL-cholesterol concentration. In women, we detected an interaction between the APOE and APOB genes, suggesting a potentially protective genetic profile. In conclusion, our results highlight the importance of considering interactions involving lipid metabolism genes when studying markers of cardiovascular diseases. Combining data mining methods and statistics in this way is original and promising. We are currently working on this combination, with the aim of obtaining a general methodology for mining biological data and establishing new potential disease susceptibility profiles.

References
[1] Hand D, Mannila H, Smyth P. Principles of Data Mining, MIT Press, Cambridge, 2001.
[2] Siest G, Visvikis S, Herbeth B, Gueguen G, Vincent-Viry M et al. CCLM 1998; 36:35-42.
[3] Pasquier N, Bastide Y, Taouil R, Lakhal L. Int J Information Systems 1999; 24:25-46.
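
The closed frequent patterns and association rules mentioned in the abstract can be illustrated with a minimal sketch. This is not the Close algorithm of [3] and not the authors' implementation: it is a brute-force enumeration, practical only for tiny inputs, that applies the same definitions (support, closure, confidence). The item labels "A" to "D", the toy transactions, and the 0.5 support threshold are hypothetical and carry no biological meaning.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of attributes
# observed for one individual. Labels are placeholders, not cohort data.
TRANSACTIONS = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"B", "D"}),
    frozenset({"A", "B", "C"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def closure(itemset, transactions):
    """Largest itemset shared by all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset.intersection(*covering)


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force search: map each closed frequent itemset to its support.

    An itemset and its closure have the same support, so collecting the
    closures of all frequent itemsets yields exactly the closed ones.
    """
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                closed[closure(candidate, transactions)] = s
    return closed


def confidence(antecedent, consequent, transactions):
    """Confidence of the association rule antecedent -> consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


if __name__ == "__main__":
    patterns = closed_frequent_itemsets(TRANSACTIONS, min_support=0.5)
    for itemset, s in sorted(patterns.items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), s)
    print(confidence(frozenset({"A"}), frozenset({"B"}), TRANSACTIONS))
```

On this toy input, the seven frequent itemsets collapse into three closed ones, which is the compression that makes closed patterns a convenient basis for generating rules. Candidate rules obtained this way are only hypotheses; as in the abstract, they would still need to be checked with appropriate statistical tests.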
Document type:
Conference paper
Biologie Prospective - Santorini Conference, 2004, Santorini, Greece, 1 p, 2004

https://hal.inria.fr/inria-00100207
Contributor: Publications Loria
Submitted on: Tuesday, 26 September 2006 - 10:15:30
Last modified on: Thursday, 11 January 2018 - 06:19:52

Identifiers

  • HAL Id : inria-00100207, version 1

Citation

Sandy Maumus, Amedeo Napoli, Catherine Sass, Eliane Albuisson, Sophie Visvikis. A new approach to detect Interactions Involving Lipid Genes by Combining Data Mining and Statistics in the STANISLAS Cohort. Biologie Prospective - Santorini Conference, 2004, Santorini, Greece, 1 p, 2004. 〈inria-00100207〉
