Imbalanced Data Classification: A Novel Re-sampling Approach Combining Versatile Improved SMOTE and Rough Sets

Abstract: In recent years, learning from imbalanced data has emerged as an important and challenging problem. The underrepresentation of one class in the data set is not the only source of difficulty. A complex data distribution, especially small disjuncts, noise and class overlapping, contributes to a significant degradation of classifier performance. Numerous solutions have therefore been proposed. They fall into three groups: data-level techniques, algorithm-level methods and cost-sensitive approaches. This paper presents a novel data-level method combining Versatile Improved SMOTE and rough sets. The algorithm was applied to two-class problems on data sets described by nominal attributes. We evaluated the proposed technique in comparison with other preprocessing methods. The impact of the additional cleaning phase was specifically verified.
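To make the idea of a data-level technique on nominal attributes concrete, here is a minimal sketch of minority-class oversampling. It is not the method proposed in the paper: it is a simplified, SMOTE-N-style illustration under stated assumptions, in which each synthetic record takes, per attribute, the most common value among a randomly chosen minority record and its nearest minority neighbours under Hamming distance. The function names `hamming` and `oversample_nominal` and the parameters `k` and `seed` are hypothetical, and no rough-set cleaning phase is included.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attribute positions where two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def oversample_nominal(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority records by per-attribute majority vote.

    Illustrative assumption: records are equal-length tuples of nominal values.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of the base record.
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: hamming(base, m))[:k]
        group = [base] + neighbours
        # Each attribute takes the most common value within the group.
        new = tuple(Counter(col).most_common(1)[0][0] for col in zip(*group))
        synthetic.append(new)
    return synthetic


if __name__ == "__main__":
    minority = [("a", "x", "p"), ("a", "y", "p"), ("b", "x", "p"), ("a", "x", "q")]
    for record in oversample_nominal(minority, n_new=3, k=2, seed=1):
        print(record)
```

Voting over a neighbourhood rather than interpolating is the usual workaround for nominal data, where a point "between" two category values has no meaning.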
Document type:
Conference paper
Khalid Saeed; Władysław Homenda. 15th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Sep 2016, Vilnius, Lithuania. Springer International Publishing, Lecture Notes in Computer Science, LNCS-9842, pp.31-42, 2016, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-45378-1_4〉

Cited literature: 22 references

https://hal.inria.fr/hal-01637478
Contributor: Hal Ifip
Submitted on: Friday, November 17, 2017 - 15:44:16
Last modified on: Saturday, November 18, 2017 - 01:16:41
Document(s) archived on: Sunday, February 18, 2018 - 16:12:13

File

Restricted access
File visible on: 2019-01-01


License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Katarzyna Borowska, Jarosław Stepaniuk. Imbalanced Data Classification: A Novel Re-sampling Approach Combining Versatile Improved SMOTE and Rough Sets. Khalid Saeed; Władysław Homenda. 15th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Sep 2016, Vilnius, Lithuania. Springer International Publishing, Lecture Notes in Computer Science, LNCS-9842, pp.31-42, 2016, Computer Information Systems and Industrial Management. 〈10.1007/978-3-319-45378-1_4〉. 〈hal-01637478〉
