Contribution to the Automatic Recognition of Business Documents

Djamel Gaceb 1 Frank Lebourgeois 1 Véronique Eglin 1 Hubert Emptoz 1
1 imagine - Extraction de Caractéristiques et Identification
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : The automatic processing of paper documents and mail is a major challenge for companies. Current recognition systems use modular architectures in which each stage of the process is independent. To improve performance, cooperation between the modules must be reintroduced, for example by coupling segmentation with recognition, or the location of zones of interest with segmentation. In this context, we propose a mixed approach to text localization and image segmentation that respects real-time constraints. In the first part, we present the state of the art in text location and thresholding in images of postal addresses. In the second part, we describe our method, which simultaneously localizes and segments text zones. Text blocks are located using a multiresolution approach on cumulated gradients computed directly from grey-level images. Coupling the two processes (text zone location and thresholding) reduces computing time, because only the necessary parts of the image are processed, and yields a better character segmentation for OCR (Optical Character Recognition). We present the results obtained by implementing our approach on an industrial line that processes several tons of documents from large companies every day.
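The coupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the block size, the `grad_min` cutoff and the local mid-range threshold are all assumptions chosen for illustration, and a single coarse grid stands in for the multiresolution analysis. The sketch cumulates horizontal gradients of a grey-level image per block, flags high-gradient blocks as text, and thresholds only those blocks.

```python
def cumulated_gradients(img, block):
    """Sum absolute horizontal grey-level differences inside each block x block cell."""
    h, w = len(img), len(img[0])
    rows, cols = (h + block - 1) // block, (w + block - 1) // block
    acc = [[0] * cols for _ in range(rows)]
    for y in range(h):
        for x in range(w - 1):
            acc[y // block][x // block] += abs(img[y][x + 1] - img[y][x])
    return acc


def locate_and_binarize(img, block=4, grad_min=200):
    """Return (text_mask, binary); binary is 1 for ink and 0 elsewhere.

    block and grad_min are illustrative values, not taken from the paper.
    Only cells flagged as text are thresholded; other pixels stay background.
    """
    h, w = len(img), len(img[0])
    acc = cumulated_gradients(img, block)
    mask = [[v >= grad_min for v in row] for row in acc]
    binary = [[0] * w for _ in range(h)]
    for by, mask_row in enumerate(mask):
        for bx, is_text in enumerate(mask_row):
            if not is_text:
                continue  # skip non-text cells entirely: this is the time saving
            ys = range(by * block, min((by + 1) * block, h))
            xs = range(bx * block, min((bx + 1) * block, w))
            values = [img[y][x] for y in ys for x in xs]
            # Assumed local threshold: midpoint between darkest and lightest pixel.
            t = (min(values) + max(values)) / 2
            for y in ys:
                for x in xs:
                    binary[y][x] = 1 if img[y][x] < t else 0
    return mask, binary
```

Calling `locate_and_binarize(img)` on a list of rows of grey levels (0 for black, 255 for white) returns the coarse text mask and the binarized image. Because the threshold is computed per text cell, uniform background regions are never examined beyond the gradient pass.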

https://hal.inria.fr/inria-00104169
Contributor : Anne Jaigu
Submitted on : Friday, October 6, 2006 - 8:47:39 AM
Last modification on : Thursday, February 7, 2019 - 3:08:52 PM
Long-term archiving on : Tuesday, April 6, 2010 - 6:37:28 PM

Identifiers

  • HAL Id : inria-00104169, version 1

Citation

Djamel Gaceb, Frank Lebourgeois, Véronique Eglin, Hubert Emptoz. Contribution to the Automatic Recognition of Business Documents. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule, France. ⟨inria-00104169⟩
