References

Boriah, Similarity measures for categorical data: A comparative evaluation, Proceedings of the eighth SIAM International Conference on Data Mining, pp.243-254, 2008.

L. Bottou and Y. Bengio, Convergence properties of the k-means algorithms, Advances in Neural Information Processing Systems 7, pp.585-592, 1995.

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
DOI : 10.1023/A:1010933404324

V. Claveau and P. Gros, Clustering de données relationnelles pour la structuration de flux télévisuels, 2014.

Claveau, Explorer le graphe de voisinage pour améliorer les thésaurus distributionnels, 21ème conférence sur le Traitement Automatique des Langues Naturelles, 2014.

V. Claveau and A. Ncibi, Knowledge discovery with CRF-based clustering of named entities without a priori classes, Conference on Intelligent Text Processing and Computational Linguistics, pp.415-428, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01027520

T. F. Cox and M. Cox, Multidimensional Scaling, 2000.
DOI : 10.1007/978-3-540-33037-0_14

S. Junior, Learning to hash faces using large feature vectors, Workshop on Content-based Multimedia Indexing, 2015.

Dredze, NLP on spoken documents without ASR, EMNLP, 2010.

Ebadat, Semantic Clustering using Bag-of-Bag-of-Features, CORIA - COnférence en Recherche d'Information et Applications, pp.229-244, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00753912

M. Fernandez and S. Williams, Closed-Form Expression for the Poisson-Binomial Probability Density Function, IEEE Transactions on Aerospace and Electronic Systems, vol.46, issue.2, pp.803-817, 2010.
DOI : 10.1109/TAES.2010.5461658

O. Ferret, Identifying bad semantic neighbors for improving distributional thesauri, The Association for Computer Linguistics, pp.561-571, 2013.

Galliano, The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts, Conf. of the Intl. Speech Communication Association (Interspeech), pp.2583-2586, 2009.

Gravier, Audio thumbnails for spoken content without transcription based on a maximum motif coverage criterion, Annual Conf. of the Intl. Speech Communication Association, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01026402

Halkidi, On clustering validation techniques, Journal of Intelligent Information Systems, vol.17, issue.2/3, pp.107-145, 2001.
DOI : 10.1023/A:1012801612483

J. L. Hodges and L. Le Cam, The Poisson Approximation to the Poisson Binomial Distribution, The Annals of Mathematical Statistics, vol.31, issue.3, pp.737-740, 1960.
DOI : 10.1214/aoms/1177705799

A. Joly and O. Buisson, Random Maximum Margin Hashing, IEEE Computer Vision and Pattern Recognition, pp.873-880, 2011.
DOI : 10.1109/cvpr.2011.5995709
URL : https://hal.archives-ouvertes.fr/hal-00642178

A. Juan and E. Vidal, Bernoulli mixture models for binary images, Proceedings of the 17th International Conference on Pattern Recognition, pp.367-370, 2004.

R. Klinger and K. Tomanek, Classical probabilistic models and conditional random fields, Algorithm Engineering Report, 2007.

B. Kulis, Metric Learning: A Survey, Foundations and Trends in Machine Learning, pp.287-364, 2013.
DOI : 10.1561/2200000019

Lafferty, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, Proceedings of the Eighteenth International Conference on Machine Learning, pp.282-289, 2001.

Lavergne, Practical very large scale CRFs, 48th Annual Meeting of the Association for Computational Linguistics (ACL), pp.504-513, 2010.

Liu, Clustering through decision tree construction, Proceedings of the ninth international conference on Information and knowledge management, CIKM '00, pp.20-29, 2000.
DOI : 10.1145/354756.354775

Mporas, Comparison of Speech Features on the Speech Recognition Task, Journal of Computer Science, vol.3, issue.8, pp.608-616, 2007.
DOI : 10.3844/jcssp.2007.608.616

Muscariello, Audio keyword extraction by unsupervised word discovery, INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00551769

A. Park and J. Glass, Unsupervised Pattern Discovery in Speech, IEEE Transactions on Audio, Speech, and Language Processing, vol.16, issue.1, pp.186-197, 2008.
DOI : 10.1109/TASL.2007.909282

Perbet, Random Forest Clustering and Application to Video Segmentation, Proceedings of the British Machine Vision Conference 2009, 2009.
DOI : 10.5244/C.23.100

Pfitzner, Characterization and evaluation of similarity measures for pairs of clusterings, Knowledge and Information Systems, vol.8, issue.3, pp.361-394, 2009.
DOI : 10.1007/s10115-008-0150-6

W. Rand, Objective Criteria for the Evaluation of Clustering Methods, Journal of the American Statistical Association, vol.66, issue.336, pp.846-850, 1971.
DOI : 10.1080/01621459.1963.10500845

R. Xu and D. Wunsch II, Survey of clustering algorithms, IEEE Transactions on Neural Networks, vol.16, issue.3, pp.645-678, 2005.

T. Shi and S. Horvath, Unsupervised Learning With Random Forest Predictors, Journal of Computational and Graphical Statistics, vol.15, issue.1, pp.118-138, 2006.
DOI : 10.1198/106186006X94072

S. van Dongen, Graph clustering by flow simulation, 2000.