Complexity Bounds for Batch Active Learning in Classification

Philippe Rolet 1 Olivier Teytaud 1, 2
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: Active learning is a branch of Machine Learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their solutions (their labels), is allowed to choose, from a set of unlabeled data, which instances to query. It is suited to settings where labeling instances is costly. This paper analyzes the speed-up of batch (parallel) active learning compared to sequential active learning (where instances are chosen one by one): how much faster can an algorithm become if it can query λ instances at once? There are two main contributions: proving lower and upper bounds on the possible gain, and illustrating them with experiments on standard active learning algorithms. Roughly speaking, the speed-up is asymptotically logarithmic in the batch size λ (i.e. when λ → ∞). However, for some classes of functions with finite VC-dimension V, a linear speed-up can be achieved up to a batch size of V. Practically speaking, this means that parallelizing computations on an expensive-to-label problem that is suited to active learning is very beneficial up to V simultaneous queries, and less beneficial (though still an improvement) afterwards.
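The logarithmic regime mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm or experimental setup: it assumes a one-dimensional threshold classifier over integer points (a class of VC-dimension 1), and the function names `batch_rounds` and `worst_case_rounds` are hypothetical. Each round queries up to λ evenly spaced points at once, which splits the remaining candidate thresholds into about λ + 1 groups, so the number of rounds shrinks roughly by a factor of log2(λ + 1) compared to sequential bisection.

```python
def batch_rounds(n_points: int, target: int, batch_size: int) -> int:
    """Count batch rounds needed to locate a threshold.

    Assumed toy setting: points are 0 .. n_points - 1, and the label of
    point q is 1 iff q >= target, with target in 0 .. n_points.
    """
    lo, hi = 0, n_points  # the unknown threshold lies in [lo, hi]
    rounds = 0
    while lo < hi:
        width = hi - lo
        k = min(batch_size, width)
        # Choose k distinct, evenly spaced query points in [lo, hi - 1].
        queries = [lo + (j * width) // (k + 1) for j in range(1, k + 1)]
        # All labels of the batch are obtained before any update.
        labels = [1 if q >= target else 0 for q in queries]
        for q, y in zip(queries, labels):
            if y == 1:
                hi = min(hi, q)      # threshold is at most q
            else:
                lo = max(lo, q + 1)  # threshold is above q
        rounds += 1
    return rounds


def worst_case_rounds(n_points: int, batch_size: int) -> int:
    """Worst case of batch_rounds over every possible threshold."""
    return max(
        batch_rounds(n_points, t, batch_size) for t in range(n_points + 1)
    )
```

With 255 points (256 candidate thresholds), `worst_case_rounds(255, 1)` is 8 and `worst_case_rounds(255, 3)` is 4: tripling the batch size only halves the number of rounds, in line with a speed-up of log2(3 + 1) = 2.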
Document type:
Conference paper
ECML 2010, Oct 2010, Barcelona, Spain. 2010

https://hal.inria.fr/inria-00533318
Contributor: Philippe Rolet
Submitted on: Friday, November 5, 2010 - 16:40:00
Last modified on: Thursday, April 5, 2018 - 12:30:12
Document(s) archived on: Sunday, February 6, 2011 - 03:00:42

File

batchal.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00533318, version 1

Citation

Philippe Rolet, Olivier Teytaud. Complexity Bounds for Batch Active Learning in Classification. ECML 2010, Oct 2010, Barcelona, Spain. 2010. 〈inria-00533318〉
