# Correlation Clustering with Adaptive Similarity Queries

4 MAGNET - Machine Learning in Information Networks
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error.
Document type :
Conference papers

Cited literature [16 references]

https://hal.inria.fr/hal-02376961
Contributor : Team Magnet <>
Submitted on : Friday, November 22, 2019 - 7:16:27 PM
Last modification on : Tuesday, November 26, 2019 - 1:35:44 AM

### File

1905.11902.pdf
Files produced by the author(s)

### Identifiers

• HAL Id : hal-02376961, version 1
• ARXIV : 1905.11902

### Citation

Marco Bressan, Nicolo Cesa-Bianchi, Andrea Paudice, Fabio Vitale. Correlation Clustering with Adaptive Similarity Queries. Conference on Neural Information Processing Systems, Dec 2019, Vancouver, Canada. ⟨hal-02376961⟩

Record views