Text/graphic separation using a sparse representation with multi-learned dictionaries

Thanh Ha Do 1,*, Salvatore Tabbone 1, Oriol Ramos Terrades 2
* Corresponding author
1 QGAR - Querying Graphics through Analysis and Recognition
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries, one for the text part and one for the graphical part. Then, we compute the sparse representations of non-overlapping document patches of several different sizes in these learned dictionaries. Based on these representations, each patch is classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered using learned thresholds.
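The classification step described in the abstract (assign each patch to the category whose dictionary reconstructs it with the smaller error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny hand-written dictionaries `TEXT_DICT` and `GRAPHIC_DICT` stand in for the learned ones, plain greedy matching pursuit stands in for whatever sparse solver the paper uses, and the function names are hypothetical. The dictionary learning, multi-size patch extraction, layer merging, and threshold-based post-processing are not shown.

```python
import math


def normalize(atom):
    """Scale an atom to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit_error(patch, dictionary, n_nonzero=2):
    """Squared reconstruction error after n_nonzero greedy matching pursuit steps.

    Assumes every atom in `dictionary` has unit norm.
    """
    residual = list(patch)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        coeff = scores[best]
        # Remove that atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return dot(residual, residual)


def classify_patch(patch, text_dict, graphic_dict, n_nonzero=2):
    """Label a flattened patch by comparing its two reconstruction errors."""
    err_text = matching_pursuit_error(patch, text_dict, n_nonzero)
    err_graphic = matching_pursuit_error(patch, graphic_dict, n_nonzero)
    return "text" if err_text < err_graphic else "graphic"


# Toy dictionaries for flattened 2x2 patches (hypothetical, NOT learned):
# oscillating atoms play the role of "text", smooth atoms of "graphic".
TEXT_DICT = [normalize(a) for a in ([1, -1, 1, -1], [1, -1, -1, 1])]
GRAPHIC_DICT = [normalize(a) for a in ([1, 1, 1, 1], [1, 1, 0, 0])]

if __name__ == "__main__":
    print(classify_patch([2, -2, 2, -2], TEXT_DICT, GRAPHIC_DICT))
    print(classify_patch([3, 3, 3, 3], TEXT_DICT, GRAPHIC_DICT))
```

In this sketch the sparsity level `n_nonzero` is a free parameter; a real system would choose it, and the patch sizes, to match the dictionaries it was trained with.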
Document type:
Conference paper
21st International Conference on Pattern Recognition - ICPR 2012, Nov 2012, Tsukuba, Japan. 2012

https://hal.inria.fr/hal-00759554
Contributor: Thanh Ha Do
Submitted on: Tuesday, December 4, 2012 - 09:15:57
Last modified on: Tuesday, April 24, 2018 - 13:36:25
Document(s) archived on: Tuesday, March 5, 2013 - 03:49:25

File

DO_extractText_SparseRepresent...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00759554, version 1

Citation

Thanh Ha Do, Salvatore Tabbone, Oriol Ramos Terrades. Text/graphic separation using a sparse representation with multi-learned dictionaries. 21st International Conference on Pattern Recognition - ICPR 2012, Nov 2012, Tsukuba, Japan. 2012. 〈hal-00759554〉
