
Efficient Version Space Algorithms for "Human-in-the-Loop" Model Development

Abstract : When active learning (AL) is applied to help a user develop a model on a large dataset by interactively presenting data instances for labeling, existing AL techniques can suffer from two main drawbacks: first, they may require hundreds of labeled data instances to reach high accuracy; second, retrieving the next instance to label can be time-consuming, making it incompatible with the interactive nature of the human exploration process. To address these issues, we introduce a novel version-space-based AL algorithm for kernel classifiers, which not only has strong theoretical guarantees on performance, but also allows for an implementation that is efficient in both time and space. In addition, by leveraging additional insights obtained in the user labeling process, we are able to factorize the version space and perform active learning in a set of subspaces, which further reduces the user labeling effort. Evaluation results show that our algorithms significantly outperform state-of-the-art version space algorithms, as well as a recent factorization-aware algorithm, for model development over large datasets.
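
The abstract does not spell out the algorithm itself. As a rough illustration of what version-space-based instance selection generally means, the sketch below is a generic toy and not the authors' method: it assumes linear classifiers through the origin in two dimensions, samples hypotheses consistent with the labels by plain rejection sampling, and queries the point on which the sampled hypotheses disagree most. The function names (`sample_version_space`, `pick_query`) and the toy data are hypothetical; kernel classifiers, efficient sampling, and factorization are not attempted here.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_version_space(labeled, n_samples, rng, dim=2, max_tries=100_000):
    """Rejection-sample weight vectors w with sign(w . x) == y for every labeled (x, y).

    Assumption: hypotheses are linear classifiers through the origin, so only
    the direction of w matters and no normalization is needed.
    """
    samples = []
    for _ in range(max_tries):
        if len(samples) == n_samples:
            break
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if all(y * dot(w, x) > 0 for x, y in labeled):
            samples.append(w)
    return samples


def pick_query(unlabeled, hypotheses):
    """Return the index of the unlabeled point whose predicted labels are most evenly split."""
    if not hypotheses:
        raise ValueError("no hypotheses sampled from the version space")

    def imbalance(x):
        positive = sum(1 for w in hypotheses if dot(w, x) > 0)
        return abs(positive / len(hypotheses) - 0.5)

    return min(range(len(unlabeled)), key=lambda i: imbalance(unlabeled[i]))


# Toy data (hypothetical): both labels only constrain the first weight to be positive.
rng = random.Random(0)
labeled = [((1.0, 0.0), 1), ((-1.0, 0.0), -1)]
unlabeled = [(1.0, 0.1), (0.0, 1.0), (-1.0, -0.2)]

hypotheses = sample_version_space(labeled, 200, rng)
query_index = pick_query(unlabeled, hypotheses)
print(unlabeled[query_index])
```

In this toy setup the consistent hypotheses agree almost unanimously on the first and third unlabeled points, so the second point is the one selected for labeling: its label would discard roughly half of the sampled version space. Rejection sampling becomes impractical as labels accumulate, which is one reason efficient version space sampling is a research problem in its own right.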

https://hal.inria.fr/hal-03064769
Contributor : Luciano Di Palma
Submitted on : Monday, December 14, 2020 - 3:22:11 PM
Last modification on : Tuesday, December 15, 2020 - 4:04:16 AM

File

Submission-2020-09.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03064769, version 1

Citation

Luciano Palma, Yanlei Diao, Anna Liu. Efficient Version Space Algorithms for "Human-in-the-Loop" Model Development. 2020. ⟨hal-03064769⟩
