Clustering of Multiple Dissimilarity Data Tables for Documents Categorization

Abstract: This paper introduces a clustering algorithm that partitions objects while simultaneously taking into account their relational descriptions, given by multiple dissimilarity matrices. These matrices may have been generated using different sets of variables and a fixed dissimilarity function, a fixed set of variables and different dissimilarity functions, or different sets of variables and different dissimilarity functions. The method, which is based on the dynamic hard clustering algorithm for relational data, is designed to provide a partition and a prototype for each cluster, and to learn a relevance weight for each dissimilarity matrix, by optimizing an adequacy criterion that measures the fit between clusters and their representatives. The relevance weights change at each iteration of the algorithm and differ from one cluster to another. Experiments aimed at categorizing a document database demonstrate the usefulness of this partitional clustering method.
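
The alternating scheme described in the abstract (allocate objects, update one prototype per cluster, update one relevance weight per cluster and per dissimilarity matrix) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the weight-update rule, so the sketch assumes that the weights of each cluster are constrained to have a product of 1; it also restricts each prototype to be a member of its own cluster. The function name `cluster_multi_dissimilarity`, its parameters, and the toy data are hypothetical.

```python
import numpy as np

def cluster_multi_dissimilarity(D, n_clusters, init=None, n_iter=50, seed=0):
    """Hard clustering from p dissimilarity matrices D of shape (p, n, n).

    Returns (labels, prototypes, weights); weights[k, j] is the relevance
    of matrix j for cluster k (assumption: product over j equals 1).
    """
    D = np.asarray(D, dtype=float)
    p, n, _ = D.shape
    rng = np.random.default_rng(seed)
    if init is None:
        prototypes = rng.choice(n, size=n_clusters, replace=False)
    else:
        prototypes = np.array(init, dtype=int)
    weights = np.ones((n_clusters, p))
    labels = np.full(n, -1)
    eps = 1e-12  # guards against zero sums (e.g. singleton clusters)

    for _ in range(n_iter):
        # Allocation: cost[k, i] = sum_j weights[k, j] * D[j, i, prototypes[k]]
        cost = np.einsum("kj,jik->ki", weights, D[:, :, prototypes])
        new_labels = cost.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels

        for k in range(n_clusters):
            idx = np.flatnonzero(labels == k)
            if idx.size == 0:
                continue  # keep the previous prototype and weights
            # Prototype: cluster member minimizing the weighted sum of
            # dissimilarities to the other members (simplifying assumption).
            Dw = np.tensordot(weights[k], D, axes=1)  # shape (n, n)
            prototypes[k] = idx[Dw[np.ix_(idx, idx)].sum(axis=0).argmin()]
            # Weights: geometric mean of the per-matrix sums divided by
            # each sum, so the product of the weights of cluster k is 1.
            s = D[:, idx, prototypes[k]].sum(axis=1) + eps
            weights[k] = np.exp(np.log(s).mean()) / s

    return labels, prototypes, weights

# Toy data: view 1 separates two groups, view 2 is uninformative.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
y = np.array([0.3, 0.9, 0.1, 0.7, 0.2, 0.8])
D = np.stack([np.abs(x[:, None] - x[None, :]),
              np.abs(y[:, None] - y[None, :])])
labels, prototypes, weights = cluster_multi_dissimilarity(D, 2, init=[0, 3])
```

On this toy input the informative matrix receives the larger weight in both clusters, which is the behaviour the abstract attributes to the relevance weights. With random initialization (`init=None`) the procedure may stop in a different local optimum.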
Document type:
Conference paper
COMPSTAT 2010 - 19th International Conference on Computational Statistics, Aug 2010, Paris, France. Physica-Verlag, pp.1263-1270, 2010, 〈10.1007/978-3-7908-2604-3〉

https://hal.inria.fr/inria-00586225
Contributor: Thierry Despeyroux
Submitted on: Friday, April 15, 2011 - 14:06:36
Last modified on: Tuesday, April 17, 2018 - 11:32:18
Document(s) archived on: Thursday, November 8, 2012 - 16:36:33

File

Yves-Francisco-COMPSTAT2010.pd...
Files produced by the author(s)

Citation

Yves Lechevallier, Francisco De Carvalho, Thierry Despeyroux, Filipe De Melo. Clustering of Multiple Dissimilarity Data Tables for Documents Categorization. COMPSTAT 2010 - 19th International Conference on Computational Statistics, Aug 2010, Paris, France. Physica-Verlag, pp.1263-1270, 2010, 〈10.1007/978-3-7908-2604-3〉. 〈inria-00586225〉
