B. Alipanahi, A. Delong, M. T. Weirauch, and B. J. Frey, Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning, Nature Biotechnology, vol.33, issue.8, pp.831-838, 2015.
DOI : 10.1038/nbt.3300

C. Angermueller, T. Pärnamaa, L. Parts, and O. Stegle, Deep learning for computational biology, Molecular Systems Biology, vol.12, issue.7, p.878, 2016.
DOI : 10.15252/msb.20156651

URL : http://msb.embopress.org/content/msb/12/7/878.full.pdf

T. K. Attwood, M. E. Beck, D. R. Flower, P. Scordis, and J. Selley, The PRINTS protein fingerprint database in its fifth year, Nucleic Acids Research, vol.26, issue.1, pp.304-308, 1998.
DOI : 10.1093/nar/26.1.304

URL : https://academic.oup.com/nar/article-pdf/26/1/304/7048381/26-1-304.pdf

T. L. Bailey, DREME: motif discovery in transcription factor ChIP-seq data, Bioinformatics, vol.27, issue.12, pp.1653-1659, 2011.
DOI : 10.1093/bioinformatics/btr261

URL : https://academic.oup.com/bioinformatics/article-pdf/27/12/1653/17124808/btr261.pdf

P. Baldi, Y. Chauvin, T. Hunkapiller, and M. A. McClure, Hidden Markov models of biological primary sequence information, Proceedings of the National Academy of Sciences, vol.91, issue.3, pp.1059-1063, 1994.
DOI : 10.1073/pnas.91.3.1059

URL : http://www.pnas.org/content/91/3/1059.full.pdf

A. Ben-Hur, C. S. Ong, S. Sonnenburg, B. Schölkopf, and G. Rätsch, Support Vector Machines and Kernels for Computational Biology, PLoS Computational Biology, vol.4, issue.10, p.1000173, 2008.
DOI : 10.1371/journal.pcbi.1000173

URL : https://doi.org/10.1371/journal.pcbi.1000173

A. Bietti and J. Mairal, Group invariance and stability to deformations of deep convolutional representations. arXiv preprint, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01536004

R. Caruana, Multitask Learning: A Knowledge-Based Source of Inductive Bias, International Conference on Machine Learning (ICML), 1993.
DOI : 10.1016/B978-1-55860-307-3.50012-5

D. Castelvecchi, Can we open the black box of AI?, Nature, vol.538, issue.7623, pp.20-23, 2016.
DOI : 10.1038/538020a

R. Collobert and J. Weston, A unified architecture for natural language processing, Proceedings of the 25th international conference on Machine learning, ICML '08, 2008.
DOI : 10.1145/1390156.1390177

A. Drouin, S. Giguère, M. Déraspe, M. Marchand, M. Tyers et al., Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons, BMC Genomics, vol.17, issue.1, p.754, 2016.
DOI : 10.1186/s12864-016-2889-6

URL : https://bmcgenomics.biomedcentral.com/track/pdf/10.1186/s12864-016-2889-6?site=bmcgenomics.biomedcentral.com

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, LIBLINEAR: A library for large linear classification, Journal of Machine Learning Research (JMLR), vol.9, pp.1871-1874, 2008.

S. J. Hanson and L. Y. Pratt, Comparing biases for minimal network construction with backpropagation, Advances in Neural Information Processing Systems (NIPS), pp.177-185, 1989.

T. Jaakkola, M. Diekhans, and D. Haussler, A Discriminative Framework for Detecting Remote Protein Homologies, Journal of Computational Biology, vol.7, issue.1-2, pp.95-114, 2000.
DOI : 10.1089/10665270050081405

URL : http://galileo.gmu.edu/~lhunter/reprints/haussler.ps

A. Jha, M. R. Gazzara, and Y. Barash, Integrative deep models for alternative splicing, Bioinformatics, vol.33, issue.14, pp.i274-i282, 2017.
DOI : 10.1093/bioinformatics/btx268

D. R. Kelley, J. Snoek, and J. L. Rinn, Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks, Genome Research, vol.26, issue.7, pp.990-999, 2016.
DOI : 10.1101/gr.200535.115

D. Kingma and J. Ba, Adam: A method for stochastic optimization. arXiv preprint, 2014.

P. W. Koh, E. Pierson, and A. Kundaje, Denoising genome-wide histone ChIP-seq with convolutional neural networks, Bioinformatics, vol.33, issue.14, pp.i225-i233, 2017.
DOI : 10.1093/bioinformatics/btx243

A. Krogh, M. Brown, I. S. Mian, K. Sjölander, and D. Haussler, Hidden Markov Models in Computational Biology, Journal of Molecular Biology, vol.235, issue.5, pp.1501-1531, 1994.
DOI : 10.1006/jmbi.1994.1104

A. Krogh and J. A. Hertz, A simple weight decay can improve generalization, Advances in Neural Information Processing Systems (NIPS), pp.950-957, 1992.

J. Lanchantin, R. Singh, B. Wang, and Y. Qi, Deep Motif Dashboard: visualizing and understanding genomic sequences using deep neural networks, Biocomputing 2017, 2016.
DOI : 10.1142/9789813207813_0025

Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.
DOI : 10.1038/nature14539

H. Lee, P. Pham, Y. Largman, and A. Y. Ng, Unsupervised feature learning for audio classification using convolutional deep belief networks, Advances in Neural Information Processing Systems (NIPS), 2009.

C. Leslie, E. Eskin, J. Weston, and W. Noble, Mismatch String Kernels for SVM Protein Classification, Advances in Neural Information Processing Systems 15, 2003.
DOI : 10.1093/bioinformatics/btg431

URL : https://academic.oup.com/bioinformatics/article-pdf/20/4/467/476867/btg431.pdf

D. C. Liu and J. Nocedal, On the limited memory BFGS method for large scale optimization, Mathematical Programming, vol.45, issue.1-3, pp.503-528, 1989.
DOI : 10.1007/BF01589116

URL : http://www.ece.northwestern.edu/~nocedal/PDFfiles/limited-memory.pdf

A. Mathelier, O. Fornes, D. J. Arenillas, C. Chen, G. Denay et al., JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles, Nucleic Acids Research, vol.44, issue.D1, pp.D110-D115, 2016.
DOI : 10.1093/nar/gkv1176

URL : https://hal.archives-ouvertes.fr/hal-01281181

A. Morrow, V. Shankar, D. Petersohn, A. Joseph, B. Recht et al., Convolutional kitchen sinks for transcription factor binding site prediction. arXiv preprint, 2017.

N. Murray and F. Perronnin, Generalized Max Pooling, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.317

URL : http://arxiv.org/pdf/1406.0312

F. Picard, J. Cadoret, B. Audit, A. Arneodo, A. Alberti et al., The Spatiotemporal Program of DNA Replication Is Associated with Specific Combinations of Chromatin Marks in Human Cells, PLoS Genetics, vol.10, issue.5, p.1004282, 2014.
DOI : 10.1371/journal.pgen.1004282

URL : https://hal.archives-ouvertes.fr/hal-00995097

H. Saigo, J. Vert, N. Ueda, and T. Akutsu, Protein homology detection using string alignment kernels, Bioinformatics, vol.20, issue.11, pp.1682-1689, 2004.
DOI : 10.1093/bioinformatics/bth141

URL : https://hal.archives-ouvertes.fr/hal-00433587

B. Schölkopf, R. Herbrich, and A. Smola, A Generalized Representer Theorem, Computational Learning Theory, 2001.
DOI : 10.1007/3-540-44581-1_27

B. Schölkopf and A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, 2002.

J. Shawe-Taylor and N. Cristianini, Kernel methods for pattern analysis, 2004.
DOI : 10.1017/CBO9780511809682

A. Shrikumar, P. Greenside, and A. Kundaje, Learning important features through propagating activation differences, International Conference on Machine Learning (ICML), 2017.

A. Shrikumar, P. Greenside, and A. Kundaje, Reverse-complement parameter sharing improves deep learning models for genomics. bioRxiv, p.103663, 2017.
DOI : 10.1101/103663

URL : http://biorxiv.org/content/biorxiv/early/2017/01/27/103663.full.pdf

S. Sinha and M. Tompa, Discovery of novel transcription factor binding sites by statistical overrepresentation, Nucleic Acids Research, vol.30, issue.24, pp.5549-5560, 2002.
DOI : 10.1093/nar/gkf669

URL : https://academic.oup.com/nar/article-pdf/30/24/5549/3751278/gkf669.pdf

A. J. Smola and B. Schölkopf, Sparse greedy matrix approximation for machine learning, International Conference on Machine Learning (ICML), 2000.

N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, vol.15, issue.1, pp.1929-1958, 2014.

A. J. Stewart, S. Hannenhalli, and J. B. Plotkin, Why Transcription Factor Binding Sites Are Ten Nucleotides Long, Genetics, vol.192, issue.3, pp.973-985, 2012.
DOI : 10.1534/genetics.112.143370

URL : http://www.genetics.org/content/genetics/192/3/973.full.pdf

G. D. Stormo, DNA binding sites: representation and discovery, Bioinformatics, vol.16, issue.1, pp.16-23, 2000.
DOI : 10.1093/bioinformatics/16.1.16

URL : https://academic.oup.com/bioinformatics/article-pdf/16/1/16/669871/160016.pdf

J. Wang, J. Zhuang, S. Iyer, X. Lin, M. C. Greven et al., Factorbook.org: a wiki-based database for transcription factor-binding data generated by the ENCODE consortium, Nucleic Acids Research, vol.41, issue.D1, pp.D171-D176, 2012.

C. K. Williams and M. Seeger, Using the Nyström method to speed up kernel machines, Advances in Neural Information Processing Systems (NIPS), pp.682-688, 2001.

H. Zeng, M. D. Edwards, G. Liu, and D. K. Gifford, Convolutional neural network architectures for predicting DNA–protein binding, Bioinformatics, vol.32, issue.12, pp.i121-i127, 2016.
DOI : 10.1093/bioinformatics/btw255

URL : http://doi.org/10.1093/bioinformatics/btw255

J. Zhou and O. Troyanskaya, Predicting effects of noncoding variants with deep learning–based sequence model, Nature Methods, vol.12, issue.10, pp.931-934, 2015.
DOI : 10.1038/nmeth.3547

URL : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4768299/pdf

C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal, Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization, ACM Transactions on Mathematical Software, vol.23, issue.4, pp.550-560, 1997.
DOI : 10.1145/279232.279236