Predicting Transcription Factor Binding Sites with Convolutional Kernel Networks

Abstract: The growing volume of available biological sequences makes it possible to learn genotype-phenotype relationships from data with increasing accuracy. Given large sets of sequences with known phenotypes, machine learning methods can build functions that predict the phenotype of new, unannotated sequences. Deep neural networks in particular have recently performed well on such prediction tasks, but they are notoriously difficult to analyze or interpret. Here, we introduce a hybrid approach between kernel methods and convolutional neural networks for sequences. It retains the ability of neural networks to learn good representations for the learning problem at hand, while defining a well-characterized Hilbert space in which to describe prediction functions. Our method outperforms state-of-the-art convolutional neural networks on a transcription factor binding prediction task, while being much faster to train and yielding more stable and interpretable results. Source code is freely available at https://gitlab.inria.fr/dchen/CKN-seq.
Document type:
Preprint, working paper
2017

Cited literature: 47 references

https://hal.inria.fr/hal-01632912
Contributor: Julien Mairal
Submitted on: Friday, November 10, 2017 - 17:16:49
Last modified on: Tuesday, November 14, 2017 - 16:18:59

File

ckn_seq_paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01632912, version 1

Citation

Dexiong Chen, Laurent Jacob, Julien Mairal. Predicting Transcription Factor Binding Sites with Convolutional Kernel Networks. 2017. 〈hal-01632912〉

Metrics

Record views: 138
File downloads: 17