Conference papers

Whole is Greater than Sum of Parts: Recognizing Scene Text Words

Vibhor Goel (1), Anand Mishra (1), Karteek Alahari (2, 3), C. V. Jawahar (1)
(3) WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract : Recognizing text in images taken in the wild is a challenging problem that has received great attention in recent years. Previous methods addressed this problem by first detecting individual characters and then forming them into words. Such approaches often suffer from weak character detections, because scene text characters show large intra-class variations, larger than those of characters in scanned documents. We take a different view of the problem and present a holistic word recognition framework. We first represent the scene text image, and synthetic images generated from lexicon words, using gradient-based features. We then recognize the text in the image by matching the scene and synthetic image features with our novel weighted Dynamic Time Warping (wDTW) approach. We perform experimental analysis on challenging public datasets, such as Street View Text and ICDAR 2003. Our proposed method significantly outperforms our earlier work in Mishra et al. (CVPR 2012), as well as many other recent works, such as Novikova et al. (ECCV 2012), Wang et al. (ICPR 2012), and Wang et al. (ICCV 2011).
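The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' wDTW: the abstract does not specify how the weights are defined, so the code below assumes, purely for illustration, one weight per feature dimension applied inside a Euclidean local cost, on top of textbook Dynamic Time Warping. The names `weighted_dtw`, `recognize`, and the toy one-dimensional features are hypothetical.

```python
import math


def weighted_dtw(seq_a, seq_b, weights=None):
    """Textbook DTW distance between two sequences of feature vectors.

    seq_a, seq_b: lists of equal-length feature vectors (e.g. one per image column).
    weights: optional per-dimension weights (an assumption of this sketch,
             not the weighting scheme of the paper). None gives plain DTW.
    """
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    if weights is None:
        weights = [1.0] * len(seq_a[0])

    n, m = len(seq_a), len(seq_b)
    # cost[i][j]: cheapest alignment of the first i vectors of seq_a
    # with the first j vectors of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Weighted Euclidean distance between the two feature vectors.
            local = math.sqrt(sum(
                w * (x - y) ** 2
                for w, x, y in zip(weights, seq_a[i - 1], seq_b[j - 1])
            ))
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(query, lexicon_features, weights=None):
    """Return the lexicon word whose feature sequence is closest to the query."""
    return min(
        lexicon_features,
        key=lambda word: weighted_dtw(query, lexicon_features[word], weights),
    )


# Toy usage with made-up one-dimensional "features" per column.
query = [[0.0], [1.0], [1.0], [2.0]]
lexicon = {
    "ab": [[0.0], [1.0], [2.0]],
    "cd": [[5.0], [5.0], [5.0]],
}
best = recognize(query, lexicon)
```

In this sketch the warping lets the repeated middle column of `query` align with a single column of the `"ab"` template, which is why sequences of different widths can still be compared. A real system would replace the toy vectors with gradient-based features computed from the scene image and from rendered lexicon words.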
Document type :
Conference papers

https://hal.inria.fr/hal-01064766
Contributor : Suha Kwak
Submitted on : Wednesday, September 17, 2014 - 11:31:12 AM
Last modification on : Tuesday, September 22, 2020 - 3:49:56 AM
Long-term archiving on : Thursday, December 18, 2014 - 10:31:43 AM

File

goelICDAR13.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01064766, version 1

Citation

Vibhor Goel, Anand Mishra, Karteek Alahari, C. V. Jawahar. Whole is Greater than Sum of Parts: Recognizing Scene Text Words. International Conference on Document Analysis and Recognition, Aug 2013, Washington DC, United States. ⟨hal-01064766⟩
