Scaling Analysis of Affinity Propagation

Cyril Furtlehner 1 Michèle Sebag 1 Zhang Xiangliang 1
1 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : We analyze and exploit some scaling properties of the {\em Affinity Propagation} (AP) clustering algorithm proposed by Frey and Dueck (2007). Following a divide and conquer strategy we setup an exact renormalization-based approach to address the question of clustering consistency, in particular, how many cluster are present in a given data set. We first observe that the divide and conquer strategy, used on a large data set hierarchically reduces the complexity ${\cal O}(N^2)$ to ${\cal O}(N^{(h+2)/(h+1)})$, for a data-set of size $N$ and a depth $h$ of the hierarchical strategy. For a data-set embedded in a $d$-dimensional space, we show that this is obtained without notably damaging the precision except in dimension $d=2$. In fact, for $d$ larger than $2$ the relative loss in precision scales like $N^{(2-d)/(h+1)d}$. Finally, under some conditions we observe that there is a value $s^*$ of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for $ss^*$) of the underlying hidden cluster structure. At this precise point holds a self-similarity property which can be exploited by the hierarchical strategy to actually locate its position, as a result of an exact decimation procedure. From this observation, a strategy based on \AP\ can be defined to find out how many clusters are present in a given dataset.
Type de document :
Article dans une revue
Physical Review E : Statistical, Nonlinear, and Soft Matter Physics, American Physical Society, 2010, 81 (6), pp.066102. 〈10.1103/PhysRevE.81.066102〉
Liste complète des métadonnées

Littérature citée [18 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/inria-00420407
Contributeur : Cyril Furtlehner <>
Soumis le : mercredi 9 juin 2010 - 14:33:24
Dernière modification le : jeudi 11 janvier 2018 - 06:22:14
Document(s) archivé(s) le : jeudi 23 septembre 2010 - 12:37:59

Fichier

pre_v2.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Collections

Citation

Cyril Furtlehner, Michèle Sebag, Zhang Xiangliang. Scaling Analysis of Affinity Propagation. Physical Review E : Statistical, Nonlinear, and Soft Matter Physics, American Physical Society, 2010, 81 (6), pp.066102. 〈10.1103/PhysRevE.81.066102〉. 〈inria-00420407v2〉

Partager

Métriques

Consultations de la notice

368

Téléchargements de fichiers

121