P. Larrañaga, B. Calvo, R. Santana, C. Bielza, J. Galdiano et al., Machine learning in bioinformatics, Briefings in Bioinformatics, vol.7, issue.1, pp.86-112, 2006.
DOI : 10.1093/bib/bbk007

P. Berkhin, A survey of clustering data mining techniques, in Grouping Multidimensional Data, pp.25-71, 2006.

N. Belacel and M. Cuperlovic-Culf, Clustering: unsupervised learning in large screening biological data, 2010.

W. Li and A. Godzik, Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences, Bioinformatics, vol.22, issue.13, pp.1658-1659, 2006.
DOI : 10.1093/bioinformatics/btl158

R. C. Edgar, Search and clustering orders of magnitude faster than BLAST, Bioinformatics, vol.26, issue.19, pp.2460-2461, 2010.
DOI : 10.1093/bioinformatics/btq461

URL : https://academic.oup.com/bioinformatics/article-pdf/26/19/2460/16896486/btq461.pdf

I. Rigoutsos and A. Floratos, Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm [published erratum appears in Bioinformatics 1998;14(2):229], Bioinformatics, vol.14, issue.1, pp.55-67, 1998.
DOI : 10.1093/bioinformatics/14.1.55

N. Darzentas, A. Hadzidimitriou, F. Murray, K. Hatzi, P. Josefsson et al., A different ontogenesis for chronic lymphocytic leukemia cases carrying stereotyped antigen receptors: molecular and computational evidence, Leukemia, pp.125-157, 2010.
DOI : 10.1038/leu.2009.186

URL : https://www.nature.com/articles/leu2009186.pdf

Y. Cai, W. Zheng, J. Yao, Y. Yang, V. Mai et al., ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time, PLOS Computational Biology, vol.13, issue.4, 2017.
DOI : 10.1371/journal.pcbi.1005518

IMGT, the international ImMunoGeneTics information system