Complexity Bounds for Batch Active Learning in Classification

Philippe Rolet; Olivier Teytaud

Conference paper, Year: 2010

Philippe Rolet

Role: Author
PersonId : 858981

Laboratoire de Recherche en Informatique

Olivier Teytaud

Role: Author
PersonId : 581
IdHAL : olivier-teytaud
IdRef : 05971008X

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Abstract

Active learning is a branch of Machine Learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their solutions (their labels), is allowed to choose, from a set of unlabeled data, which instances to query. It is suited to settings where labeling instances is costly. This paper analyzes the speed-up of batch (parallel) active learning compared to sequential active learning (where instances are chosen one by one): how much faster can an algorithm become if it can query λ instances at once? There are two main contributions: proving lower and upper bounds on the possible gain, and illustrating them with experiments on standard active learning algorithms. Roughly speaking, the speed-up is asymptotically logarithmic in the batch size λ (i.e. when λ → ∞). However, for some classes of functions with finite VC-dimension V, a linear speed-up can be achieved up to a batch size of V. Practically speaking, this means that parallelizing computations on an expensive-to-label problem which is suited to active learning is very beneficial up to V simultaneous queries, and less beneficial (yet still bringing improvement) afterwards.
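The logarithmic regime can be illustrated on a toy case that is not taken from the paper: locating a threshold on a line (a class of VC-dimension 1) among a finite number of candidate positions. The sketch below assumes noise-free labels and evenly spaced queries; the function names `rounds_to_locate` and `speedup` are hypothetical. Sequential active learning is then binary search, while a batch of λ queries splits the remaining candidates into λ + 1 groups per round, so the number of rounds shrinks by a factor of about log2(λ + 1).

```python
import math


def rounds_to_locate(n_candidates: int, batch_size: int) -> int:
    """Worst-case number of query rounds needed to isolate one threshold
    among n_candidates positions when batch_size labels are requested per round."""
    remaining = n_candidates  # positions still consistent with the labels seen
    rounds = 0
    while remaining > 1:
        # batch_size evenly spaced queries cut the candidates into
        # batch_size + 1 groups; the worst case keeps the largest group.
        remaining = math.ceil(remaining / (batch_size + 1))
        rounds += 1
    return rounds


def speedup(n_candidates: int, batch_size: int) -> float:
    """Ratio of sequential rounds (batch size 1) to batch rounds."""
    return rounds_to_locate(n_candidates, 1) / rounds_to_locate(n_candidates, batch_size)


if __name__ == "__main__":
    n = 2 ** 20
    for lam in (1, 3, 15, 1023):
        print(lam, rounds_to_locate(n, lam), speedup(n, lam))
```

With 2^20 candidates, batch sizes 1, 3, 15 and 1023 need 20, 10, 5 and 2 rounds respectively, so multiplying the batch size by roughly 64 (from 15 to 1023) only raises the speed-up from 4 to 10. This toy model does not show the linear regime up to batch size V, which the abstract states for some classes of higher VC-dimension.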

Domains

Machine Learning [stat.ML]; Learning [cs.LG]

Main file

batchal.pdf (158.41 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/inria-00533318

Submitted on: Friday, November 5, 2010, 16:40:00

Last modified on: Monday, February 12, 2024, 09:48:04

Long-term archiving on: Sunday, February 6, 2011, 03:00:42

Dates and versions

inria-00533318 , version 1 (05-11-2010)

Identifiers

HAL Id : inria-00533318 , version 1

Cite

Philippe Rolet, Olivier Teytaud. Complexity Bounds for Batch Active Learning in Classification. ECML 2010, Oct 2010, Barcelona, Spain. ⟨inria-00533318⟩

Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 LRI-AO UNIV-PARIS-SACLAY
