Identifying and Mitigating Labelling Errors in Active Learning

Mohamed-Rafik Bouguelia 1, Yolande Belaïd 1, Abdel Belaïd 1
1 READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Most existing active learning methods for classification assume that the observed labels (i.e. those given by a human labeller) are perfectly correct. However, in real-world applications, the labeller is usually subject to labelling errors that reduce the classification accuracy of the learned model. In this paper, we address this issue for active learning in the streaming setting and try to answer the following questions: (1) which labelled instances are most likely to be mislabelled? (2) is it always good to abstain from learning when data is suspected to be mislabelled? (3) which mislabelled instances require relabelling? We propose a hybrid active learning strategy based on two measures. The first measure filters out potentially mislabelled instances, based on the degree of disagreement between the manually given label and the predicted class label. The second measure selects (for relabelling) only the most informative instances that deserve to be corrected. An instance is worth relabelling if it shows highly conflicting information between the predicted and the queried labels. Experiments on several real-world datasets show that filtering mislabelled instances according to the first measure and relabelling a few instances selected according to the second measure greatly improves the classification accuracy of stream-based active learning.
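The filter-then-relabel idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the two measures, so everything here is an assumption made for illustration: a single probability gap (probability of the predicted class minus probability of the given label) stands in for both measures, and the function names `disagreement` and `triage` and both threshold values are hypothetical rather than taken from the paper.

```python
from typing import Dict, Hashable


def disagreement(probs: Dict[Hashable, float], given_label: Hashable) -> float:
    """Return how strongly the model's prediction contradicts the given label.

    Assumed measure (not the paper's): P(predicted class) - P(given label),
    and 0.0 when the predicted class equals the given label.
    """
    predicted = max(probs, key=probs.get)
    if predicted == given_label:
        return 0.0
    return probs[predicted] - probs.get(given_label, 0.0)


def triage(
    probs: Dict[Hashable, float],
    given_label: Hashable,
    filter_threshold: float = 0.3,   # hypothetical value
    relabel_threshold: float = 0.7,  # hypothetical value
) -> str:
    """Decide what to do with one labelled instance from the stream.

    Returns "relabel" for highly conflicting labels, "filter" for suspicious
    labels that are simply not learned from, and "learn" otherwise.
    """
    d = disagreement(probs, given_label)
    if d >= relabel_threshold:
        return "relabel"
    if d >= filter_threshold:
        return "filter"
    return "learn"
```

For example, with class probabilities `{"a": 0.9, "b": 0.1}` and a given label `"b"`, the gap is large and the instance would be sent back for relabelling, whereas with `{"a": 0.6, "b": 0.4}` the gap is small and the label would be accepted for learning. A real implementation would replace the probability gap with the two measures defined in the paper.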
Document type:
Book chapter
Pattern Recognition: Applications and Methods, Lecture Notes in Computer Science (9493), Springer, pp.17, 2016, 〈10.1007/978-3-319-27677-9_3〉
https://hal.inria.fr/hal-01285136
Contributor: Yolande Belaid
Submitted on: Tuesday, 8 March 2016 - 16:29:01
Last modified on: Tuesday, 24 April 2018 - 13:30:45

Citation

Mohamed-Rafik Bouguelia, Yolande Belaïd, Abdel Belaïd. Identifying and Mitigating Labelling Errors in Active Learning. Pattern Recognition: Applications and Methods, Lecture Notes in Computer Science (9493), Springer, pp.17, 2016, 〈10.1007/978-3-319-27677-9_3〉. 〈hal-01285136〉
