Dealing with highly imbalanced textual data gathered into similar classes

Jean-Charles Lamirel 1
1 SYNALP - Natural Language Processing : representations, inference and semantics
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This paper presents a new feature selection and feature contrasting approach for classifying highly imbalanced textual data whose classes are highly similar to one another. One example of such a classification context is the task of classifying bibliographic references into a patent classification scheme. This task is one of the areas investigated by the QUAERO project, whose ultimate goal is to help experts evaluate upcoming patents through the use of related research.
Document type :
Conference papers

https://hal.inria.fr/hal-00939036
Contributor : Jean-Charles Lamirel
Submitted on : Thursday, January 30, 2014 - 7:15:11 AM
Last modification on : Tuesday, December 18, 2018 - 4:38:01 PM

Citation

Jean-Charles Lamirel. Dealing with highly imbalanced textual data gathered into similar classes. IJCNN - 2013 International Joint Conference on Neural Networks, Aug 2013, Dallas, United States. ⟨10.1109/IJCNN.2013.6707044⟩. ⟨hal-00939036⟩
