Conference paper, Year: 2019

A Factorized Version Space Algorithm for "Human-In-the-Loop" Data Exploration

Abstract

While active learning (AL) has recently been applied to help users explore large databases and retrieve data instances of interest, existing methods often require a large number of labeled instances to achieve good accuracy. To address this slow convergence problem, our work augments version space-based AL algorithms, which have strong theoretical results on convergence but are very costly to run, with additional insights obtained from the user labeling process. These insights lead to a novel algorithm that factorizes the version space to perform active learning in a set of subspaces. Our work offers theoretical results on optimality and approximation for this algorithm, as well as optimizations for better performance. Evaluation results show that our factorized version space algorithm significantly outperforms other version space algorithms, as well as a recent factorization-aware algorithm, for large database exploration.
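To make the idea of factorizing a version space concrete, the following is a toy sketch and not the algorithm from the paper. It assumes that the "additional insights" take the form of one yes/no label per subspace, that each subspace uses one-dimensional threshold classifiers over a small finite candidate set, and that a point is of interest only when every subspace says yes. The query rule (pick the point whose estimated probability of being positive is closest to 1/2) is a simple bisection-style heuristic chosen for illustration. All names (`consistent`, `pick_query`, `explore`, `oracle`) are hypothetical.

```python
import itertools

def consistent(thresholds, labeled):
    """Keep thresholds t whose classifier (x >= t) agrees with every (x, y) label."""
    return [t for t in thresholds if all((x >= t) == y for x, y in labeled)]

def positive_fraction(version_space, x):
    """Fraction of the remaining hypotheses in one subspace that label x positive."""
    return sum(x >= t for t in version_space) / len(version_space)

def pick_query(pool, version_spaces):
    """Pick the unlabeled point whose estimated positive probability is nearest 1/2.

    Assumption: subspaces are treated as independent, so the probability of a
    positive label is the product of the per-subspace positive fractions.
    """
    def score(point):
        p = 1.0
        for vs, x in zip(version_spaces, point):
            p *= positive_fraction(vs, x)
        return abs(p - 0.5)
    return min(pool, key=score)

def explore(pool, thresholds, oracle, budget):
    """Run `budget` labeling rounds; return one version space per subspace."""
    pool = list(pool)                      # local copy; queried points are removed
    labeled = [[] for _ in thresholds]     # per-subspace (value, label) pairs
    version_spaces = [list(ts) for ts in thresholds]
    for _ in range(budget):
        point = pick_query(pool, version_spaces)
        pool.remove(point)
        partial = oracle(point)            # assumed: one boolean per subspace
        for k, (x, y) in enumerate(zip(point, partial)):
            labeled[k].append((x, y))
            version_spaces[k] = consistent(thresholds[k], labeled[k])
    return version_spaces

# Toy usage: a 10 x 10 grid, with the (hidden) interest region x0 >= 3 and x1 >= 7.
pool = list(itertools.product(range(10), repeat=2))
candidates = [list(range(11)), list(range(11))]
truth = (3, 7)

def oracle(point):
    """Simulated user who answers truthfully in each subspace."""
    return tuple(x >= t for x, t in zip(point, truth))

spaces = explore(pool, candidates, oracle, budget=10)
print([len(vs) for vs in spaces])  # number of hypotheses left in each subspace
```

The point of the factorized form in this sketch is that each label prunes a small per-subspace hypothesis set (11 candidates each) instead of a joint set over all threshold pairs (121 candidates), and the per-subspace sets can be updated independently.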
Main file
tech_report.pdf (386.51 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02274497, version 1 (29-08-2019)
hal-02274497, version 2 (03-09-2019)

Identifiers

  • HAL Id: hal-02274497, version 1

Cite

Luciano Di Palma, Yanlei Diao, Anna Liu. A Factorized Version Space Algorithm for "Human-In-the-Loop" Data Exploration. 19th IEEE International Conference on Data Mining, Nov 2019, Beijing, China. ⟨hal-02274497v1⟩