Top-Down and Bottom-Up Cues for Scene Text Recognition

Anand Mishra 1 Karteek Alahari 2, 3 C.V. Jawahar 1
3 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: Scene text recognition has gained significant attention from the computer vision community in recent years. Recognizing such text is a challenging problem, even more so than the recognition of scanned documents. In this work, we focus on the problem of recognizing text extracted from street images. We present a framework that exploits both bottom-up and top-down cues. The bottom-up cues are derived from individual character detections from the image. We build a Conditional Random Field model on these detections to jointly model the strength of the detections and the interactions between them. We impose top-down cues obtained from a lexicon-based prior, i.e., language statistics, on the model. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We show significant improvements in accuracies on two challenging public datasets, namely Street View Text (over 15%) and ICDAR 2003 (nearly 10%).
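The energy minimization described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the detections are arranged as a simple left-to-right chain, the unary cost of a character is a made-up number standing in for one minus its detection confidence, the pairwise cost is `lam * (1 - p)` where `p` is the relative frequency of a character bigram in a small hypothetical lexicon, and inference uses exact dynamic programming on the chain, which may differ from the inference method the authors use. The function names `bigram_costs` and `min_energy_word` are hypothetical.

```python
from collections import Counter


def bigram_costs(lexicon, lam):
    """Pairwise costs from lexicon bigram statistics (assumed form: lam * (1 - p))."""
    counts = Counter()
    for word in lexicon:
        counts.update(zip(word, word[1:]))
    total = sum(counts.values())
    return {pair: lam * (1.0 - n / total) for pair, n in counts.items()}


def min_energy_word(unary, pairwise, default_pair):
    """Return the (word, energy) pair that minimizes a chain energy.

    unary: one dict per detection window, mapping character -> unary cost.
    pairwise: dict mapping (previous_char, char) -> pairwise cost.
    default_pair: cost charged for character pairs missing from `pairwise`.
    """
    # best[c] = (energy of the cheapest partial word ending in c, that word)
    best = {c: (cost, c) for c, cost in unary[0].items()}
    for node in unary[1:]:
        new_best = {}
        for c, u in node.items():
            energy, word = min(
                (e + pairwise.get((p, c), default_pair), w)
                for p, (e, w) in best.items()
            )
            new_best[c] = (energy + u, word + c)
        best = new_best
    energy, word = min(best.values())
    return word, energy


# Hypothetical detections for a four-character word image.
# The detector slightly prefers "D" over "O" in the first window.
unary = [
    {"O": 0.25, "D": 0.20},
    {"P": 0.10, "F": 0.40},
    {"E": 0.20, "F": 0.30},
    {"N": 0.20, "M": 0.25},
]
lam = 2.0
pairwise = bigram_costs(["OPEN", "PEN", "OVEN", "DEN"], lam)

# Bottom-up cues alone pick the locally best character in every window.
print(min_energy_word(unary, {}, default_pair=0.0)[0])

# Adding the lexicon-based prior changes the first character.
print(min_energy_word(unary, pairwise, default_pair=lam)[0])
```

In this toy setting the unary terms alone favor "DPEN", while the lexicon prior makes "OPEN" the lower-energy word, which is the kind of correction the top-down cue is meant to provide.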
Document type:
Conference paper
CVPR - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2012, Providence, United States. IEEE, 2012, 〈10.1109/CVPR.2012.6247990〉
Cited literature [27 references]

https://hal.inria.fr/hal-00818178
Contributor: Karteek Alahari
Submitted on: Monday, 3 November 2014 - 11:07:28
Last modified on: Monday, 28 May 2018 - 15:10:03

Attachment

mishra12.pdf
Files produced by the author(s)

Citation

Anand Mishra, Karteek Alahari, C.V. Jawahar. Top-Down and Bottom-Up Cues for Scene Text Recognition. CVPR - IEEE Conference on Computer Vision and Pattern Recognition, Jun 2012, Providence, United States. IEEE, 2012, 〈10.1109/CVPR.2012.6247990〉. 〈hal-00818178〉

Metrics

Record views

629

File downloads

986