Journal article, Journal of Biomedical Semantics, Year: 2017

Learning from biomedical linked data to suggest valid pharmacogenes

Abstract

Background: A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tend to generate many false positives, computational approaches based on background knowledge have been proposed. Until now, only molecular networks or the biomedical literature have been used, whereas many other resources are available. Method: We propose here to consume a larger and more diverse set of resources using linked data related to genes, drugs, or diseases. One advantage of linked data is that they are built on a standard framework that facilitates the joint use of various sources, and thus makes it easier to consider features of various origins. We propose a selection and linkage of data sources relevant to pharmacogenomics, including, for example, DisGeNET and ClinVar. We use machine learning to identify and prioritize the pharmacogenes that are most likely to be valid, given the selected linked data. This identification relies on classifying gene-drug pairs as either pharmacogenomically associated or not, and was tested with two machine learning methods, random forest and graph kernel, whose results are compared in this article. Results: We assembled a set of linked data related to pharmacogenomics, comprising 2,610,793 triples from six distinct resources. Learning from these data, random forest identifies valid pharmacogenes with an F-measure of 0.73 in 10-fold cross-validation, whereas graph kernel achieves an F-measure of 0.81. A list of top candidates proposed by both approaches is provided, and how they were obtained is discussed.
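The evaluation protocol named in the abstract (F-measure under 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. It does not reproduce the authors' pipeline, features, or classifiers: the function names `f_measure` and `k_fold_indices`, the toy labels, and the simple interleaved fold assignment (no shuffling or stratification) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (F1): harmonic mean of precision and recall for label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each sample is tested exactly once.

    Assumption: folds are built by interleaving indices, which is only one
    of several reasonable ways to partition the data.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


# Hypothetical toy labels for eight gene-drug pairs:
# 1 = pharmacogenomically associated, 0 = not associated.
toy_true = [1, 1, 1, 0, 0, 1, 0, 1]
toy_pred = [1, 1, 0, 0, 1, 1, 0, 1]
toy_score = f_measure(toy_true, toy_pred)
```

In a full experiment, a classifier would be trained on each `train` index list and scored with `f_measure` on the matching `test` list, and the k scores would then be averaged.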
Origin: Publication funded by an institution

Dates and versions

hal-01511773, version 1 (21-04-2017)


Cite

Kevin Dalleau, Yassine Marzougui, Sébastien da Silva, Patrice Ringot, Ndeye Coumba Ndiaye, et al. Learning from biomedical linked data to suggest valid pharmacogenes. Journal of Biomedical Semantics, 2017, 8 (1), pp. 16. ⟨10.1186/s13326-017-0125-1⟩. ⟨hal-01511773⟩