Conference Papers Year : 2021

Communication-efficient Federated Learning through Clustering optimization

Hugo Miralles, Tamara Tosic, Michel Riveill

Abstract

We study model personalization in Federated Learning (FL) when the data collected at network nodes is non-IID (not Independent and Identically Distributed) and network communication is cost-constrained. Classical FL collaboratively trains a single global model. When data is statistically heterogeneous (non-IID), personalized models for groups of nodes with similar statistics have been shown to outperform a single global model [1]. We propose a Clustered Federated Learning approach that trades off how well models are adapted to nodes locally against the communication cost. Our method identifies clusters of nodes with similar data statistics, which improves local model accuracy. Specifically, it seeks the cluster structure, the cluster heads and a set of model weights (one per cluster) that minimize an objective function composed of two terms: a classical multi-task optimization term and a communication cost regularization term. Local model updates serve as proxies for the local data distributions (statistically similar training sets yield similar updates), and similar updates are aggregated together [2,3,4]. Our algorithm has two phases: initialization and cluster optimization. During initialization, nodes collaboratively train an initial global model. The cluster head nodes are identified and nodes are clustered based solely on communication cost minimization [5]. The cluster optimization phase starts by applying Hierarchical Agglomerative Clustering to a distance metric composed of two terms: the cosine dissimilarity between the locally computed model updates of two nodes, and the communication cost of grouping the two nodes in the same cluster. In parallel, the respective cluster heads are also optimized. The clusters are organized in a tree hierarchy. At each round, the cluster heads check whether a new cluster optimization is needed, based on the model update values. If so, the same method is applied to create further sub-clusters. We evaluate our method on several non-IID settings generated from the MNIST dataset, while simulating the communication cost at each round. We show that our algorithm increases the share of nodes reaching 99% accuracy from 48% to 72% and can reduce the overall communication cost by 35%. Finally, it can adapt the cluster structure to new conditions (new network nodes or local data distributions that evolve over time) through a search of the tree structure.
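To make the two-term distance in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the two terms are combined as a weighted sum with a hypothetical weight `lam`, that average linkage is used, that the number of clusters is fixed in advance, and that `comm_cost` is a symmetric matrix of pairwise costs; the abstract specifies none of these. All function names are illustrative.

```python
import math


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two non-zero update vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def combined_distance(updates, comm_cost, lam):
    """Pairwise distance: update dissimilarity plus weighted communication cost.

    The weighted sum and the weight `lam` are assumptions made for this sketch.
    """
    n = len(updates)
    return [
        [
            0.0 if i == j
            else cosine_dissimilarity(updates[i], updates[j]) + lam * comm_cost[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def agglomerative_clusters(dist, n_clusters):
    """Average-linkage agglomerative clustering on a distance matrix.

    Repeatedly merges the two closest clusters until `n_clusters` remain.
    Returns a list of sorted lists of node indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy example: nodes 0 and 1 send similar updates, as do nodes 2 and 3.
updates = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
# Hypothetical uniform pairwise communication cost (zero on the diagonal).
comm_cost = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]

clusters = agglomerative_clusters(combined_distance(updates, comm_cost, lam=0.1), 2)
print(clusters)  # [[0, 1], [2, 3]]
```

With a uniform communication cost the grouping is driven entirely by the update dissimilarity. Raising `lam` or using a non-uniform `comm_cost` would shift the grouping towards nodes that are cheap to connect, which is the trade-off the abstract describes.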
Main file
Abstract_vfinale.pdf (175.98 KB)
Origin : Files produced by the author(s)

Dates and versions

hal-03479640 , version 1 (17-12-2021)

Identifiers

  • HAL Id : hal-03479640 , version 1

Cite

Hugo Miralles, Tamara Tosic, Michel Riveill. Communication-efficient Federated Learning through Clustering optimization. SophI.A. Summit, Nov 2021, Biot, France. ⟨hal-03479640⟩
