Top-Down and Bottom-Up Cues for Scene Text Recognition

Anand Mishra; Karteek Alahari; C.V. Jawahar

doi:10.1109/CVPR.2012.6247990

Conference paper, Year: 2012

Anand Mishra

Role: Author

Center for Visual Information Technology [Hyderabad]

Karteek Alahari

Role: Author
PersonId: 19670
IdHAL: karteek
ORCID: 0000-0002-1838-5936
IdRef: 196283892

Laboratoire d'informatique de l'école normale supérieure

Models of visual object recognition and scene understanding

C.V. Jawahar

Role: Author

Center for Visual Information Technology [Hyderabad]

Abstract

Scene text recognition has gained significant attention from the computer vision community in recent years. Recognizing such text is a challenging problem, even more so than the recognition of scanned documents. In this work, we focus on the problem of recognizing text extracted from street images. We present a framework that exploits both bottom-up and top-down cues. The bottom-up cues are derived from individual character detections from the image. We build a Conditional Random Field model on these detections to jointly model the strength of the detections and the interactions between them. We impose top-down cues obtained from a lexicon-based prior, i.e. language statistics, on the model. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We show significant improvements in accuracies on two challenging public datasets, namely Street View Text (over 15%) and ICDAR 2003 (nearly 10%).
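The energy-minimization idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the graph structure, the features, or the inference method, so this sketch assumes one detection window per character position (a chain), made-up unary costs, pairwise costs taken as negative log bigram probabilities from a toy lexicon, and exact dynamic programming for inference. The names `bigram_costs`, `min_energy_word`, and `weight` are hypothetical.

```python
import math
from collections import Counter


def bigram_costs(lexicon, alphabet, smoothing=1.0):
    """Pairwise costs -log P(b | a), estimated from lexicon bigram counts.

    Additive smoothing keeps unseen character pairs at a finite cost.
    """
    counts = Counter()
    for word in lexicon:
        for a, b in zip(word, word[1:]):
            counts[(a, b)] += 1
    costs = {}
    for a in alphabet:
        total = sum(counts[(a, b)] for b in alphabet) + smoothing * len(alphabet)
        for b in alphabet:
            costs[(a, b)] = -math.log((counts[(a, b)] + smoothing) / total)
    return costs


def min_energy_word(unary, pairwise, weight=1.0):
    """Return (word, energy) minimizing unary plus weighted pairwise costs.

    unary: one dict per character position, mapping a candidate label to
    its detection cost (lower means a stronger detection).
    """
    # best[label] = (energy, labelling) of the cheapest prefix ending in label
    best = {label: (cost, label) for label, cost in unary[0].items()}
    for node in unary[1:]:
        new_best = {}
        for label, cost in node.items():
            prev_energy, prev_word = min(
                ((e + weight * pairwise[(p, label)], w) for p, (e, w) in best.items()),
                key=lambda t: t[0],
            )
            new_best[label] = (prev_energy + cost, prev_word + label)
        best = new_best
    energy, word = min(best.values(), key=lambda t: t[0])
    return word, energy


# Toy example: the second window is ambiguous, and "b" scores slightly
# better than "p" on detection strength alone.
lexicon = ["open", "pen", "oven"]
alphabet = "benopv"
unary = [{"o": 0.1}, {"p": 0.9, "b": 0.7}, {"e": 0.2}, {"n": 0.1}]
pairwise = bigram_costs(lexicon, alphabet)

print(min_energy_word(unary, pairwise, weight=0.0)[0])  # bottom-up cues only
print(min_energy_word(unary, pairwise, weight=1.0)[0])  # with the lexicon prior
```

With the prior switched off, the labelling follows the raw detection costs and picks "b"; with the prior on, the bigram costs from the lexicon outweigh the small unary gap and the labelling becomes "open". On a chain, dynamic programming finds the exact minimum; a model with overlapping or missing detections would need a richer graph and a different inference scheme.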

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

mishra12.pdf (398.94 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00818178

Submitted on: Monday, November 3, 2014, 11:07:28

Last modified on: Friday, April 19, 2024, 16:18:55

Dates and versions

hal-00818178 , version 1 (03-11-2014)

Identifiers

HAL Id: hal-00818178, version 1
DOI: 10.1109/CVPR.2012.6247990

Cite

Anand Mishra, Karteek Alahari, C.V. Jawahar. Top-Down and Bottom-Up Cues for Scene Text Recognition. CVPR - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2012, Providence, United States. ⟨10.1109/CVPR.2012.6247990⟩. ⟨hal-00818178⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL
