
SAPKOS: Experimental Czech Multi-label Document Classification and Analysis System

Abstract: This paper presents SAPKOS, an experimental multi-label document classification and analysis system. The system, which integrates state-of-the-art machine learning and natural language processing approaches, is intended for use by the Czech News Agency (ČTK). Its main purpose is to save human resources in the task of annotating newspaper articles with topics. Another important function is the automatic comparison of ČTK production with popular Czech media. The results of this analysis will be used to adapt ČTK production so that it better corresponds to today’s market requirements. To the best of our knowledge, no other automatic Czech document classification system exists. The system also achieves very high accuracy, owing to its unique architecture, which integrates a maximum-entropy-based classification engine with a novel confidence measure method.
Document type: Conference papers

Cited literature: 34 references
Contributor: Hal Ifip
Submitted on: Friday, October 21, 2016 - 11:43:36 AM
Last modification on: Thursday, March 5, 2020 - 5:41:08 PM




Distributed under a Creative Commons Attribution 4.0 International License



Ladislav Lenc, Pavel Král. SAPKOS: Experimental Czech Multi-label Document Classification and Analysis System. 11th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI 2015), Sep 2015, Bayonne, France. pp. 337-350, ⟨10.1007/978-3-319-23868-5_24⟩. ⟨hal-01385368⟩


