Skip to Main content Skip to Navigation
Journal articles

Scaling Analysis of Affinity Propagation

Cyril Furtlehner 1 Michèle Sebag 1 Zhang Xiangliang 1
1 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : We analyze and exploit some scaling properties of the {\em Affinity Propagation} (AP) clustering algorithm proposed by Frey and Dueck (2007). Following a divide and conquer strategy we setup an exact renormalization-based approach to address the question of clustering consistency, in particular, how many cluster are present in a given data set. We first observe that the divide and conquer strategy, used on a large data set hierarchically reduces the complexity ${\cal O}(N^2)$ to ${\cal O}(N^{(h+2)/(h+1)})$, for a data-set of size $N$ and a depth $h$ of the hierarchical strategy. For a data-set embedded in a $d$-dimensional space, we show that this is obtained without notably damaging the precision except in dimension $d=2$. In fact, for $d$ larger than $2$ the relative loss in precision scales like $N^{(2-d)/(h+1)d}$. Finally, under some conditions we observe that there is a value $s^*$ of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for $ss^*$) of the underlying hidden cluster structure. At this precise point holds a self-similarity property which can be exploited by the hierarchical strategy to actually locate its position, as a result of an exact decimation procedure. From this observation, a strategy based on \AP\ can be defined to find out how many clusters are present in a given dataset.
Complete list of metadata

Cited literature [18 references]  Display  Hide  Download
Contributor : Cyril Furtlehner Connect in order to contact the contributor
Submitted on : Wednesday, June 9, 2010 - 2:33:24 PM
Last modification on : Thursday, July 8, 2021 - 3:49:23 AM
Long-term archiving on: : Thursday, September 23, 2010 - 12:37:59 PM


Files produced by the author(s)




Cyril Furtlehner, Michèle Sebag, Zhang Xiangliang. Scaling Analysis of Affinity Propagation. Physical Review E : Statistical, Nonlinear, and Soft Matter Physics, American Physical Society, 2010, 81 (6), pp.066102. ⟨10.1103/PhysRevE.81.066102⟩. ⟨inria-00420407v2⟩



Les métriques sont temporairement indisponibles