Ontology-guided data preparation for discovering genotype-phenotype relationships

Adrien Coulet 1 Malika Smaïl-Tabbone 1 Pascale Benlian Amedeo Napoli 1 Marie-Dominique Devignes 1
1 ORPAILLEUR - Knowledge representation, reasoning
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Background: The complexity and volume of post-genomic data are two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in the life sciences. Bio-ontologies can now play a key role in knowledge discovery in the life sciences by providing semantics for data and for extracted units, taking advantage of progress in Semantic Web technologies, in particular the better understanding and availability of tools for knowledge representation, extraction, and reasoning. Results: This paper presents a method that exploits bio-ontologies to guide data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, and class descriptions are taken into account for data selection, before the data mining step. Each scenario is illustrated with a case study on the search for genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guidance of data selection by domain knowledge is analysed and shown to have a direct influence on the volume and significance of the data mining results. Conclusions: The method proposed in this paper, which relies on domain knowledge, is an efficient alternative to numerical methods for data selection. In turn, the results of this study may be reused in ontology modelling and data integration.
Document type: Journal article
BMC Bioinformatics, BioMed Central, 2008, 9 (Suppl 4), pp.S3

https://hal.inria.fr/inria-00338669
Contributor: Malika Smail-Tabbone
Submitted on: Thursday, 13 November 2008 - 23:07:25
Last modified on: Thursday, 11 January 2018 - 06:19:54

Identifiers

  • HAL Id: inria-00338669, version 1

Citation

Adrien Coulet, Malika Smaïl-Tabbone, Pascale Benlian, Amedeo Napoli, Marie-Dominique Devignes. Ontology-guided data preparation for discovering genotype-phenotype relationships. BMC Bioinformatics, BioMed Central, 2008, 9 (Suppl 4), pp.S3. 〈inria-00338669〉
