G. J. Brostow, J. Shotton, J. Fauqueur, and R. Cipolla, Segmentation and Recognition Using Structure from Motion Point Clouds, ECCV, pp.44-57, 2008.
DOI : 10.1007/978-3-540-88682-2_5

C. Case, B. Suresh, A. Coates, and A. Y. Ng, Autonomous sign reading for semantic mapping, 2011 IEEE International Conference on Robotics and Automation, 2011.
DOI : 10.1109/ICRA.2011.5980523

C. Chang and C. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, 2011.
DOI : 10.1145/1961189.1961199

X. Chen and A. L. Yuille, Detecting and reading text in natural scenes, CVPR (2), pp.366-373, 2004.

A. Coates, B. Carpenter, C. Case, S. Satheesh, B. Suresh et al., Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011 International Conference on Document Analysis and Recognition, pp.440-445, 2011.
DOI : 10.1109/ICDAR.2011.95

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512

T. E. de Campos, B. R. Babu, and M. Varma, Character recognition in natural images, VISAPP, 2009.

C. Desai, D. Ramanan, and C. Fowlkes, Discriminative models for multi-class object layout, ICCV, 2009.

B. Epshtein, E. Ofek, and Y. Wexler, Detecting text in natural scenes with stroke width transform, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540041

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, 2010.
DOI : 10.1109/TPAMI.2009.167

S. Gould, T. Gao, and D. Koller, Region-based segmentation and object detection, NIPS, 2009.

A. Gupta, Y. Verma, and C. V. Jawahar, Choosing linguistics over vision to describe images, AAAI, 2012.

D. Hoiem, A. Efros, and M. Hebert, Closing the loop in scene interpretation, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587587

T. Judd, K. Ehinger, F. Durand, and A. Torralba, Learning to predict where humans look, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459462

V. Kolmogorov, Convergent Tree-Reweighted Message Passing for Energy Minimization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.10, 2006.
DOI : 10.1109/TPAMI.2006.200

G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi et al., Baby talk: Understanding and generating simple image descriptions, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995466

L. Ladicky, P. Sturgess, K. Alahari, C. Russell, and P. H. Torr, What, Where and How Many? Combining Object Detectors and CRFs, ECCV, 2010.
DOI : 10.1007/978-3-642-15561-1_31
URL : https://hal.archives-ouvertes.fr/hal-01216730

J. Lafferty, A. McCallum, and F. Pereira, Conditional random fields: Probabilistic models for segmenting and labelling sequence data, ICML, pp.282-289, 2001.

Learning to combine bottom-up and top-down segmentation.

L. Neumann and J. Matas, A Method for Text Localization and Recognition in Real-World Images, ACCV, 2010.

J. Pearl, Probabilistic Reasoning in Intelligent Systems : Networks of Plausible Inference, 1988.

J. Shotton, J. Winn, C. Rother, and A. Criminisi, TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, International Journal of Computer Vision, vol.62, issue.1-2, pp.2-23, 2009.
DOI : 10.1007/s11263-007-0109-1

D. L. Smith, J. Feild, and E. G. Learned-Miller, Enforcing similarity constraints with integer programming for better scene text recognition, CVPR, 2011.

R. Smith, Limits on the Application of Frequency-Based Language Models to OCR, 2011 International Conference on Document Analysis and Recognition, 2011.
DOI : 10.1109/ICDAR.2011.114

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 2001.
DOI : 10.1109/CVPR.2001.990517

K. Wang, B. Babenko, and S. Belongie, End-to-end scene text recognition, ICCV, 2011.

K. Wang and S. Belongie, Word Spotting in the Wild, ECCV, pp.591-604, 2010.
DOI : 10.1007/978-3-642-15549-9_43

J. J. Weinman, E. G. Learned-Miller, and A. R. Hanson, Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.10, 2009.
DOI : 10.1109/TPAMI.2009.38