Rough Sets in Imbalanced Data Problem: Improving Re–sampling Process

Abstract: The imbalanced data problem remains one of the most interesting and important research subjects. Recent experiments and detailed analyses have revealed that performance loss in machine learning is caused not only by underrepresented classes but also by the inherently complex characteristics of the data. The significant difficulty factors identified so far include class overlapping, decomposition of the minority class, and the presence of noise and outliers. Although numerous solutions have been proposed, it is still unclear how to deal with all of these issues together and how to correctly evaluate the class distribution in order to select a proper treatment (especially in real-world applications, where levels of uncertainty are very high). Since applying rough set theory to the imbalanced data learning problem could be a promising research direction, this paper introduces an improved re-sampling approach that combines selective preprocessing and editing techniques. The novel technique handles both qualitative and quantitative data.
Document type:
Conference paper
Khalid Saeed; Władysław Homenda; Rituparna Chaki. 16th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Jun 2017, Bialystok, Poland. Springer International Publishing, Lecture Notes in Computer Science, LNCS-10244, pp.459-469, 2017, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-59105-6_39〉

Cited literature: 25 references

https://hal.inria.fr/hal-01656246
Contributor: Hal Ifip
Submitted on: Tuesday, December 5, 2017 - 14:58:50
Last modified on: Wednesday, December 6, 2017 - 01:21:00

File

Restricted access
File visible on: 2020-01-01


License


Distributed under a Creative Commons Attribution 4.0 International License


Citation

Katarzyna Borowska, Jarosław Stepaniuk. Rough Sets in Imbalanced Data Problem: Improving Re–sampling Process. Khalid Saeed; Władysław Homenda; Rituparna Chaki. 16th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Jun 2017, Bialystok, Poland. Springer International Publishing, Lecture Notes in Computer Science, LNCS-10244, pp.459-469, 2017, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-59105-6_39〉. 〈hal-01656246〉


Metrics

Record views

19