Whole is Greater than Sum of Parts: Recognizing Scene Text Words

Vibhor Goel; Anand Mishra; Karteek Alahari; C. V. Jawahar

Conference paper, Year: 2013


Vibhor Goel

Role: Author

Center for Visual Information Technology [Hyderabad]

Anand Mishra

Role: Author

Center for Visual Information Technology [Hyderabad]

Karteek Alahari

Role: Author
PersonId : 19670
IdHAL : karteek
ORCID : 0000-0002-1838-5936
IdRef : 196283892

Laboratoire d'informatique de l'école normale supérieure

Models of visual object recognition and scene understanding

C. V. Jawahar

Role: Author

Center for Visual Information Technology [Hyderabad]

Abstract

Recognizing text in images taken in the wild is a challenging problem that has received considerable attention in recent years. Previous methods addressed this problem by first detecting individual characters and then combining them into words. Such approaches often suffer from weak character detections caused by large intra-class variations, which are more pronounced in scene text than in characters from scanned documents. We take a different view of the problem and present a holistic word recognition framework. We first represent the scene text image and synthetic images generated from lexicon words using gradient-based features. We then recognize the text in the image by matching the scene and synthetic image features with our novel weighted Dynamic Time Warping (wDTW) approach. We perform experimental analysis on challenging public datasets, such as Street View Text and ICDAR 2003. Our proposed method significantly outperforms our earlier work in Mishra et al. (CVPR 2012), as well as many other recent works, such as Novikova et al. (ECCV 2012), Wang et al. (ICPR 2012), and Wang et al. (ICCV 2011).
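The matching step described in the abstract can be illustrated with a minimal sketch. The abstract specifies neither the exact gradient-based features nor how the wDTW weights are computed, so everything below is an assumption made for illustration: each image is treated as a sequence of per-column feature vectors, the local cost is a Euclidean distance scaled by a hypothetical per-column weight, and the names `weighted_dtw` and `recognize` are invented here. It shows generic weighted dynamic time warping over a lexicon, not the authors' implementation.

```python
import math


def weighted_dtw(query, reference, weights=None):
    """Weighted DTW cost between two sequences of feature vectors.

    weights[j] scales the local cost of matching any query column to
    reference column j. This weighting is a hypothetical choice; the
    abstract does not say how the paper's weights are defined.
    """
    n, m = len(query), len(reference)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    if weights is None:
        weights = [1.0] * m  # uniform weights reduce to plain DTW

    # cost[i][j]: best cumulative cost aligning query[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = weights[j - 1] * math.dist(query[i - 1], reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat the reference column
                cost[i][j - 1],      # repeat the query column
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def recognize(scene_features, lexicon_features, lexicon_weights=None):
    """Return (word, cost) for the lexicon entry with the lowest wDTW cost."""
    best_word, best_cost = None, math.inf
    for word, reference in lexicon_features.items():
        w = None if lexicon_weights is None else lexicon_weights.get(word)
        c = weighted_dtw(scene_features, reference, w)
        if c < best_cost:
            best_word, best_cost = word, c
    return best_word, best_cost


# Toy data: 2-D "features" standing in for per-column gradient descriptors
# of synthetic images rendered from two lexicon words.
lexicon = {
    "AB": [(0.0, 1.0), (1.0, 0.0)],
    "BA": [(1.0, 0.0), (0.0, 1.0)],
}
# A scene word whose first character is stretched over two columns.
scene = [(0.1, 0.9), (0.1, 1.0), (0.9, 0.1)]
word, cost = recognize(scene, lexicon)
```

In this toy case the warping absorbs the stretched first character, so the scene sequence aligns cheaply with the entry whose columns appear in the same order. A real system would replace the hand-written tuples with features extracted from the cropped scene word and from rendered lexicon words.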

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

goelICDAR13.pdf (510.18 KB)

Origin: Files produced by the author(s)

Contributor: Suha Kwak

https://inria.hal.science/hal-01064766

Submitted on: Wednesday, September 17, 2014, 11:31:12

Last modified on: Friday, April 19, 2024, 16:18:55

Long-term archiving on: Thursday, December 18, 2014, 10:31:43

Dates and versions

hal-01064766 , version 1 (17-09-2014)

Identifiers

HAL Id : hal-01064766 , version 1

Cite

Vibhor Goel, Anand Mishra, Karteek Alahari, C. V. Jawahar. Whole is Greater than Sum of Parts: Recognizing Scene Text Words. International Conference on Document Analysis and Recognition, Aug 2013, Washington DC, United States. ⟨hal-01064766⟩


Collections

ENS-PARIS CNRS INRIA INRIA2 PSL

229 views

720 downloads
