Book sections

Identifying and Mitigating Labelling Errors in Active Learning

Mohamed-Rafik Bouguelia 1 Yolande Belaïd 1 Abdel Belaïd 1
1 READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Most existing active learning methods for classification assume that the observed labels (i.e. those given by a human labeller) are perfectly correct. However, in real-world applications, the labeller is usually subject to labelling errors that reduce the classification accuracy of the learned model. In this paper, we address this issue for active learning in the streaming setting and try to answer the following questions: (1) which labelled instances are most likely to be mislabelled? (2) is it always good to abstain from learning when data is suspected to be mislabelled? (3) which mislabelled instances require relabelling? We propose a hybrid active learning strategy based on two measures. The first measure filters the potentially mislabelled instances, based on the degree of disagreement between the manually given label and the predicted class label. The second measure selects for relabelling only the most informative instances that deserve to be corrected. An instance is worth relabelling if it shows highly conflicting information between the predicted and the queried labels. Experiments on several real-world datasets show that filtering mislabelled instances according to the first measure and relabelling a few instances selected according to the second measure greatly improves the classification accuracy of stream-based active learning.
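The filter-then-relabel idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the definitions of the two measures, so everything here is an assumption made for illustration: the `disagreement` score (the gap between the model's probability for its predicted class and its probability for the labeller's class), the function names, and both thresholds are hypothetical stand-ins, not the authors' measures.

```python
from typing import Sequence


def disagreement(probs: Sequence[float], given_label: int) -> float:
    """Hypothetical score in [0, 1]: how strongly the model prefers its own
    predicted class over the label supplied by the human labeller."""
    predicted = max(range(len(probs)), key=lambda k: probs[k])
    if predicted == given_label:
        return 0.0  # model and labeller agree, nothing to suspect
    return probs[predicted] - probs[given_label]


def triage(probs: Sequence[float], given_label: int,
           filter_at: float = 0.3, relabel_at: float = 0.7) -> str:
    """Decide what to do with one labelled instance from the stream.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    score = disagreement(probs, given_label)
    if score < filter_at:
        return "learn"      # label looks plausible: update the model
    if score >= relabel_at:
        return "relabel"    # strong conflict: worth asking for a second label
    return "set_aside"      # suspected mislabel, but not worth a relabel query


# Usage: class probabilities predicted by the current model, plus the
# label index given by the labeller.
print(triage([0.9, 0.05, 0.05], given_label=0))
print(triage([0.6, 0.2, 0.2], given_label=1))
print(triage([0.9, 0.05, 0.05], given_label=1))
```

Using a single score with two thresholds is a simplification. The abstract describes two distinct measures, one for filtering and one for choosing which filtered instances to relabel, so a faithful implementation would compute them separately.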
Contributor : Yolande Belaid
Submitted on : Tuesday, March 8, 2016 - 4:29:01 PM
Last modification on : Friday, January 15, 2021 - 5:42:02 PM




Mohamed-Rafik Bouguelia, Yolande Belaïd, Abdel Belaïd. Identifying and Mitigating Labelling Errors in Active Learning. Pattern Recognition: Applications and Methods, Lecture Notes in Computer Science (9493), Springer, pp.17, 2016, ⟨10.1007/978-3-319-27677-9_3⟩. ⟨hal-01285136⟩


