Optimization for active learning-based interactive database exploration

Enhui Huang (1, 2), Liping Peng (3), Luciano Di Palma (1, 2), Ahmed Abdelkafi (2), Anna Liu (3), Yanlei Diao (1, 3, 2)
2 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract: There is an increasing gap between the fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content from data more effectively. In this work, we aim to build interactive data exploration as a new database service, using an approach called "explore-by-example". In particular, we cast the explore-by-example problem in a principled "active learning" framework, and bring the properties of important classes of database queries to bear on the design of new algorithms and optimizations for active learning-based database exploration. These new techniques allow the database system to overcome a fundamental limitation of traditional active learning, i.e., the slow convergence problem. Evaluation results using real-world datasets and user interest patterns show that our new system significantly outperforms state-of-the-art active learning techniques and data exploration systems in accuracy while achieving the desired efficiency for interactive performance.

https://hal.inria.fr/hal-01969886
Contributor: Enhui Huang
Submitted on: Monday, January 7, 2019 - 1:08:41 PM
Last modification on: Tuesday, August 6, 2019 - 11:38:50 AM
Long-term archiving on: Monday, April 8, 2019 - 3:04:12 PM

File

p71-huang.pdf
Files produced by the author(s)

Citation

Enhui Huang, Liping Peng, Luciano Di Palma, Ahmed Abdelkafi, Anna Liu, et al. Optimization for active learning-based interactive database exploration. Proceedings of the VLDB Endowment (PVLDB), VLDB Endowment, 2018, 12 (1), pp.71-84. ⟨10.14778/3275536.3275542⟩. ⟨hal-01969886⟩
