Dealing with highly imbalanced textual data gathered into similar classes

Jean-Charles Lamirel 1
1 SYNALP - Natural Language Processing : representations, inference and semantics
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This paper presents a new feature selection and feature contrasting approach for classifying highly imbalanced textual data whose classes are highly similar to one another. One example of such a classification context is the task of classifying bibliographic references into a patent classification scheme. This task is one of the areas of investigation of the QUAERO project, with the final goal of helping experts to evaluate upcoming patents through the use of related research.
Document type:
Conference paper
IJCNN - 2013 International Joint Conference on Neural Networks, Aug 2013, Dallas, United States. 2013, 〈10.1109/IJCNN.2013.6707044〉

https://hal.inria.fr/hal-00939036
Contributor: Jean-Charles Lamirel
Submitted on: Thursday, January 30, 2014 - 07:15:11
Last modified on: Tuesday, April 24, 2018 - 13:33:07

Citation

Jean-Charles Lamirel. Dealing with highly imbalanced textual data gathered into similar classes. IJCNN - 2013 International Joint Conference on Neural Networks, Aug 2013, Dallas, United States. 2013, 〈10.1109/IJCNN.2013.6707044〉. 〈hal-00939036〉
