Conference Papers Year : 2010

Complexity Bounds for Batch Active Learning in Classification

Abstract

Active learning is a branch of Machine Learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their solutions (their labels), is allowed to choose, from a set of unlabeled data, which instances to query. It is suited to settings where labeling instances is costly. This paper analyzes the speed-up of batch (parallel) active learning compared to sequential active learning (where instances are chosen one by one): how much faster can an algorithm become if it can query λ instances at once? There are two main contributions: proving lower and upper bounds on the possible gain, and illustrating them by experiments on usual active learning algorithms. Roughly speaking, the speed-up is asymptotically logarithmic in the batch size λ (i.e. when λ → ∞). However, for some classes of functions with finite VC-dimension V, a linear speed-up can be achieved up to a batch size of V. Practically speaking, this means that parallelizing computations on an expensive-to-label problem which is suited to active learning is very beneficial up to V simultaneous queries, and less interesting (yet still bringing improvement) afterwards.
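The batch-vs-sequential trade-off can be illustrated on the simplest hypothesis class, one-dimensional threshold functions (VC-dimension 1). The sketch below is a hypothetical illustration, not the algorithm analyzed in the paper: each round it queries the `batch_size` (λ) pool points that most evenly split the current version-space interval, so the interval of consistent thresholds shrinks by a factor of roughly λ+1 per round, and the number of rounds needed scales like log(1/ε)/log(λ+1) — a logarithmic speed-up in λ, consistent with the asymptotic behavior described above.

```python
def batch_active_threshold(oracle, pool, batch_size, rounds):
    """Learn a 1-D threshold classifier on [0, 1] by batch active learning.

    `oracle(x)` returns the label of x (True iff x >= unknown threshold).
    Each round queries `batch_size` unlabeled points spread evenly across
    the current version-space interval (lo, hi); every answered query
    shrinks the interval of thresholds consistent with the labels seen.
    """
    lo, hi = 0.0, 1.0          # thresholds still consistent lie in (lo, hi)
    labeled = 0
    for _ in range(rounds):
        candidates = sorted(x for x in pool if lo < x < hi)
        if not candidates:
            break              # pool resolution exhausted
        # pick batch_size points that split (lo, hi) as evenly as possible
        step = max(1, len(candidates) // (batch_size + 1))
        batch = candidates[step - 1::step][:batch_size]
        for x in batch:
            labeled += 1
            if oracle(x):      # label 1  =>  threshold <= x
                hi = min(hi, x)
            else:              # label 0  =>  threshold >  x
                lo = max(lo, x)
    return (lo + hi) / 2, labeled
```

With `batch_size=1` this degenerates to sequential binary search (interval halves per round); with larger batches each round costs λ labels but cuts the interval by about λ+1, which is exactly the diminishing, logarithmic return the bounds formalize.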
Main file: batchal.pdf (158.41 KB). Origin: files produced by the author(s).

Dates and versions

inria-00533318, version 1 (05-11-2010)

Identifiers

  • HAL Id: inria-00533318, version 1

Cite

Philippe Rolet, Olivier Teytaud. Complexity Bounds for Batch Active Learning in Classification. ECML 2010, Oct 2010, Barcelona, Spain. ⟨inria-00533318⟩