Text/graphic separation using a sparse representation with multi-learned dictionaries - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2012

Text/graphic separation using a sparse representation with multi-learned dictionaries

Abstract

In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries, one for the text parts and one for the graphical parts. Then, we compute the sparse representations of non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch is classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, the text regions are further filtered using learned thresholds.
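The classification step described in the abstract (sparse-code a patch in each dictionary, then compare reconstruction errors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `omp`, `reconstruction_error`, and `classify_patch`, the sparsity level `k`, the choice of a greedy orthogonal matching pursuit solver, and the toy dictionaries built from identity columns are all hypothetical. The abstract does not say which sparse solver or dictionary-learning algorithm is used, and the multi-size dictionaries, layer merging, and threshold-based post-processing are not shown.

```python
import numpy as np


def omp(D, x, k):
    """Greedy orthogonal matching pursuit (assumed solver).

    Approximates x with at most k columns (atoms) of D and returns
    the coefficient vector of length D.shape[1].
    """
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break  # no new atom improves the fit
        support.append(j)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef


def reconstruction_error(D, x, k):
    """Euclidean norm of x minus its k-sparse approximation in D."""
    return float(np.linalg.norm(x - D @ omp(D, x, k)))


def classify_patch(patch, D_text, D_graphic, k=2):
    """Label a patch by whichever dictionary reconstructs it better."""
    x = np.asarray(patch, dtype=float).ravel()
    err_text = reconstruction_error(D_text, x, k)
    err_graphic = reconstruction_error(D_graphic, x, k)
    return "text" if err_text <= err_graphic else "graphic"


# Toy dictionaries for 4x4 patches (placeholders, NOT learned from data):
# "text" atoms cover the first 8 pixels, "graphic" atoms the last 8.
identity = np.eye(16)
D_text = identity[:, :8]
D_graphic = identity[:, 8:]

text_like = np.zeros((4, 4))
text_like[0, 1] = 2.0
text_like[1, 2] = 1.0

graphic_like = np.zeros((4, 4))
graphic_like[3, 0] = 1.0

label_a = classify_patch(text_like, D_text, D_graphic)
label_b = classify_patch(graphic_like, D_text, D_graphic)
```

In a realistic setting the two dictionaries would be learned from training patches of text and graphics, and the same comparison would be repeated for each patch size before the per-size layers are merged.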
Main file
DO_extractText_SparseRepresentation.pdf (357.98 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00759554, version 1 (04-12-2012)

Identifiers

  • HAL Id: hal-00759554, version 1

Cite

Thanh Ha Do, Salvatore Tabbone, Oriol Ramos Terrades. Text/graphic separation using a sparse representation with multi-learned dictionaries. 21st International Conference on Pattern Recognition - ICPR 2012, Nov 2012, Tsukuba, Japan. ⟨hal-00759554⟩
