H. Brim, S. McFarlan, J. Fredrickson, K. Minton, M. Zhai et al., Engineering Deinococcus radiodurans for metal remediation in radioactive mixed waste environments, Nature Biotechnology, vol.18, issue.1, pp.85-90, 2000.
DOI : 10.1038/71986

H. Brim, A. Venkateswaran, H. Kostandarithes, J. Fredrickson, and M. Daly, Engineering Deinococcus geothermalis for Bioremediation of High-Temperature Radioactive Waste Environments, Applied and Environmental Microbiology, vol.69, issue.8, pp.4575-4582, 2003.
DOI : 10.1128/AEM.69.8.4575-4582.2003

URL : http://aem.asm.org/content/69/8/4575.full.pdf

P. Gabani and O. Singh, Radiation-resistant extremophiles and their potential in biotechnology and therapeutics, Applied Microbiology and Biotechnology, vol.97, issue.3, pp.993-1004, 2013.
DOI : 10.1007/s00253-012-4642-7

O. Singh and P. Gabani, Extremophiles: radiation resistance microbial reserves and therapeutic implications, Journal of Applied Microbiology, vol.110, issue.4, pp.851-861, 2011.
DOI : 10.1111/j.1365-2672.2011.04971.x

URL : http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2672.2011.04971.x/pdf

S. Needleman and C. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, issue.3, pp.443-453, 1970.
DOI : 10.1016/0022-2836(70)90057-4

H. Sghaier, K. Ghedira, A. Benkahla, and I. Barkallah, Basal DNA repair machinery is subject to positive selection in ionizing-radiation-resistant bacteria, BMC Genomics, vol.9, issue.1, p.297, 2008.
DOI : 10.1186/1471-2164-9-297

URL : https://hal.archives-ouvertes.fr/hal-01358559

K. Makarova, M. Omelchenko, E. Gaidamakova, V. Matrosova, A. Vasilenko et al., Deinococcus geothermalis: The Pool of Extreme Radiation Resistance Genes Shrinks, PLoS ONE, vol.2, issue.9, p.955, 2007.
DOI : 10.1371/journal.pone.0000955

URL : https://doi.org/10.1371/journal.pone.0000955

A. Alexeyenko, I. Tamas, G. Liu, and E. Sonnhammer, Automatic clustering of orthologs and inparalogs shared by multiple proteomes, Bioinformatics, vol.22, issue.14, pp.e9-e15, 2006.
DOI : 10.1093/bioinformatics/btl213

URL : https://academic.oup.com/bioinformatics/article-pdf/22/14/e9/614830/btl213.pdf

J. Thompson, D. Higgins, and T. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, issue.22, pp.4673-4680, 1994.
DOI : 10.1093/nar/22.22.4673

URL : https://academic.oup.com/nar/article-pdf/22/22/4673/7122285/22-22-4673.pdf

P. Librado and J. Rozas, DnaSP v5: a software for comprehensive analysis of DNA polymorphism data, Bioinformatics, vol.25, issue.11, pp.1451-1452, 2009.
DOI : 10.1093/bioinformatics/btp187

URL : https://academic.oup.com/bioinformatics/article-pdf/25/11/1451/950342/btp187.pdf

R. Banerjee, A. Roy, and S. Mukhopadhyay, Genomic and proteomic signatures of radiation and thermophilic adaptation in the Deinococcus-Thermus genomes, International Journal of Pharmacy and Pharmaceutical Sciences, vol.6, pp.287-300, 2014.

J. Peden, CodonW software. Available from: http://codonw.sourceforge.net, 1997.

SPSS Inc., SPSS software for Windows (version 15.0), 2007.

StatSoft Inc., Statistica 6 software. Available from: http://www.statsoft.com/Products, 2002.

T. Vesth, K. Lagesen, Ö. Acar, and D. Ussery, CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics, PLoS ONE, vol.8, issue.4, p.60120, 2013.
DOI : 10.1371/journal.pone.0060120

URL : http://doi.org/10.1371/journal.pone.0060120

S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman, Basic local alignment search tool, Journal of Molecular Biology, vol.215, issue.3, pp.403-410, 1990.
DOI : 10.1016/S0022-2836(05)80360-2

Z. Yang, PAML 4: Phylogenetic Analysis by Maximum Likelihood, Molecular Biology and Evolution, vol.24, issue.8, pp.1586-1591, 2007.
DOI : 10.1093/molbev/msm088

URL : https://academic.oup.com/mbe/article-pdf/24/8/1586/3853532/msm088.pdf

R. Tatusov, D. Natale, I. Garkavtsev, T. Tatusova, U. Shankavaram et al., The COG database: new developments in phylogenetic classification of proteins from complete genomes, Nucleic Acids Research, vol.29, issue.1, pp.22-28, 2001.
DOI : 10.1093/nar/29.1.22

V. Markowitz, I. Chen, K. Chu, E. Szeto, K. Palaniappan et al., IMG/M: the integrated metagenome data management and comparative analysis system, Nucleic Acids Research, vol.40, issue.D1, pp.D123-D129, 2012.
DOI : 10.1093/nar/gkr975

URL : https://academic.oup.com/nar/article-pdf/40/D1/D123/9483611/gkr975.pdf

I. Shuryak and E. Dadachova, Quantitative Modeling of Microbial Population Responses to Chronic Irradiation Combined with Other Stressors, PLOS ONE, vol.11, issue.1, p.147696, 2016.
DOI : 10.1371/journal.pone.0147696

URL : https://doi.org/10.1371/journal.pone.0147696

J. Fredrickson, J. Zachara, D. Balkwill, D. Kennedy, S. Li et al., Geomicrobiology of High-Level Nuclear Waste-Contaminated Vadose Sediments at the Hanford Site, Washington State, Applied and Environmental Microbiology, vol.70, issue.7, pp.4230-4241, 2004.
DOI : 10.1128/AEM.70.7.4230-4241.2004

URL : http://aem.asm.org/content/70/7/4230.full.pdf

N. Mantel, The detection of disease clustering and a generalized regression approach, Cancer Research, vol.27, issue.2, pp.209-220, 1967.

S. Dray and A. Dufour, The ade4 package: implementing the duality diagram for ecologists, Journal of Statistical Software, vol.22, issue.4, pp.1-20, 2007.
DOI : 10.18637/jss.v022.i04

URL : https://hal.archives-ouvertes.fr/hal-00434575

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
DOI : 10.1023/A:1010933404324

L. Song, P. Langfelder, and S. Horvath, Random generalized linear model: a highly accurate and interpretable ensemble predictor, BMC Bioinformatics, vol.14, issue.1, p.5, 2013.
DOI : 10.1186/1471-2105-14-5

URL : http://doi.org/10.1186/1471-2105-14-5

K. Burnham and D. Anderson, Model selection and multimodel inference: a practical information-theoretic approach, 2003.
DOI : 10.1007/b97636

K. Burnham, D. Anderson, and K. Huyvaert, AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons, Behavioral Ecology and Sociobiology, vol.65, issue.1, pp.23-35, 2011.
DOI : 10.1007/s00265-010-1029-6

M. Zoghlami, S. Aridhi, H. Sghaier, M. Maddouri, and E. Nguifo, A multiple instance learning approach for sequence data with across bag dependencies, CoRR, 2016.

Z. Xing, J. Pei, and E. Keogh, A brief survey on sequence classification, ACM SIGKDD Explorations Newsletter, vol.12, issue.1, pp.40-48, 2010.
DOI : 10.1145/1882471.1882478

URL : http://www.cs.sfu.ca/%7Ejpei/publications/Sequence%20Classification.pdf

J. Amores, Multiple instance classification: Review, taxonomy and comparative study, Artificial Intelligence, vol.201, pp.81-105, 2013.
DOI : 10.1016/j.artint.2013.06.003

URL : https://doi.org/10.1016/j.artint.2013.06.003

E. Alpaydın, V. Cheplygina, M. Loog, and D. Tax, Single- vs. multiple-instance classification, Pattern Recognition, vol.48, issue.9, pp.2831-2838, 2015.
DOI : 10.1016/j.patcog.2015.04.006

S. Andrews, I. Tsochantaridis, and T. Hofmann, Support Vector Machines for Multiple-Instance Learning, Advances in Neural Information Processing Systems, pp.561-568, 2003.

O. Maron and T. Lozano-Pérez, A Framework for Multiple-Instance Learning, Advances in Neural Information Processing Systems, pp.570-576, 1998.

A. Faria, F. Coelho, A. Silva, H. Rocha, G. Almeida et al., MILKDE: A new approach for multiple instance learning based on positive instance selection and kernel density estimation, Engineering Applications of Artificial Intelligence, vol.59, pp.196-204, 2017.
DOI : 10.1016/j.engappai.2016.12.015

R. Bunescu and R. Mooney, Multiple instance learning for sparse positive bags, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.105-112, 2007.
DOI : 10.1145/1273496.1273510

URL : http://www.cs.utexas.edu/~razvan/papers/icml07.pdf

J. Wang and J. Zucker, Solving the multiple-instance problem: A lazy learning approach, Proc. 17th International Conference on Machine Learning, pp.1119-1125, 2000.

G. Shakhnarovich, T. Darrell, and P. Indyk, Nearest-neighbor methods in learning and vision, IEEE transactions on neural networks, vol.19, issue.2, p.377, 2008.

V. Cheplygina, D. Tax, and M. Loog, Multiple instance learning with bag dissimilarities, Pattern Recognition, vol.48, issue.1, pp.264-275, 2015.
DOI : 10.1016/j.patcog.2014.07.022

URL : http://arxiv.org/pdf/1309.5643

M. Maddouri and M. Elloumi, Encoding of primary structures of biological macromolecules within a data mining perspective, Journal of Computer Science and Technology, vol.19, issue.1, pp.78-88, 2004.
DOI : 10.1007/BF02944786

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA data mining software: an update, ACM SIGKDD Explorations Newsletter, vol.11, issue.1, pp.10-18, 2009.
DOI : 10.1145/1656274.1656278

J. Platt, Fast training of support vector machines using sequential minimal optimization, Advances in Kernel Methods, pp.185-208, 1999.

S. Keerthi, S. Shevade, C. Bhattacharyya, and K. Murthy, Improvements to Platt's SMO Algorithm for SVM Classifier Design, Neural Computation, vol.13, issue.3, pp.637-649, 2001.
DOI : 10.1162/089976601300014493

J. Quinlan, C4.5: Programs for Machine Learning, 1993.

G. John and P. Langley, Estimating continuous distributions in Bayesian classifiers, Proc. 11th conference on uncertainty in artificial intelligence, pp.338-345, 1995.

S. Aridhi, H. Sghaier, M. Zoghlami, M. Maddouri, and E. Nguifo, Prediction of Ionizing Radiation Resistance in Bacteria Using a Multiple Instance Learning Model, Journal of Computational Biology, vol.23, issue.1, pp.10-20, 2016.
DOI : 10.1089/cmb.2015.0134

URL : https://hal.archives-ouvertes.fr/hal-01807946

M. Woolfit and L. Bromham, Increased Rates of Sequence Evolution in Endosymbiotic Bacteria and Fungi with Small Effective Population Sizes, Molecular Biology and Evolution, vol.20, issue.9, pp.1545-1555, 2003.
DOI : 10.1093/molbev/msg167

URL : https://academic.oup.com/mbe/article-pdf/20/9/1545/2991640/msg167.pdf