Preprints, Working Papers, ...

Efficient Version Space Algorithms for "Human-in-the-Loop" Model Development

Abstract : When active learning (AL) is applied to help a user develop a model on a large dataset by interactively presenting data instances for labeling, existing AL techniques can suffer from two main drawbacks: first, they may require hundreds of labeled data instances to reach high accuracy; second, retrieving the next instance to label can be time-consuming, which is incompatible with the interactive nature of the human exploration process. To address these issues, we introduce a novel version-space-based AL algorithm for kernel classifiers, which not only has strong theoretical guarantees on performance but also allows for an implementation that is efficient in time and space. In addition, by leveraging insights obtained during the user labeling process, we are able to factorize the version space and perform active learning in a set of subspaces, which further reduces the user labeling effort. Evaluation results show that our algorithms significantly outperform state-of-the-art version space algorithms, as well as a recent factorization-aware algorithm, for model development over large datasets.
Contributor : Luciano Di Palma
Submitted on : Monday, December 14, 2020 - 3:22:11 PM
Last modification on : Wednesday, November 3, 2021 - 9:54:39 AM
Long-term archiving on : Monday, March 15, 2021 - 7:41:16 PM




  • HAL Id : hal-03064769, version 1


Luciano Palma, Yanlei Diao, Anna Liu. Efficient Version Space Algorithms for "Human-in-the-Loop" Model Development. 2020. ⟨hal-03064769⟩


