Conference papers

Hybrid OCR combination for ancient documents

Hubert Cecotti 1, Abdel Belaïd 1
1 LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : Commercial Optical Character Recognition (OCR) systems have improved a lot in the last few years. Their outstanding ability to process different kinds of documents is their main quality. However, this generality can also be a weakness, as they cannot perfectly recognize documents that differ strongly from average present-day documents. In this paper, we propose a system that combines several OCRs with a specialized ICR (Intelligent Character Recognition) engine based on a convolutional neural network. Instead of just running several OCRs in parallel and applying a fusion rule to their results, the system adds a specialized neural network with an adaptive topology that complements the OCRs according to the errors they make. The system has been tested on ancient documents containing old characters and old fonts not used in contemporary documents. The OCR combination increases the recognition rate by about 3%, whereas the ICR improves the recognition of rejected characters by more than 5%.
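The pipeline outlined in the abstract (several OCRs fused, with rejected characters passed to a specialized recognizer) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state which fusion rule the authors use, so a simple majority vote with rejection stands in for it, and the names `fuse_votes`, `recognize`, `min_agreement`, and `icr_fallback` are hypothetical. The ICR itself (a convolutional neural network in the paper) is represented only by a callback.

```python
from collections import Counter

REJECT = None  # marker for a character position the OCR combination rejects


def fuse_votes(candidates, min_agreement=2):
    """Majority vote over the outputs of several OCR engines for one character.

    Assumption: majority voting is a stand-in fusion rule, not the rule
    used in the paper. Returns REJECT when no label wins clearly.
    """
    counts = Counter(c for c in candidates if c is not None)
    if not counts:
        return REJECT
    label, votes = counts.most_common(1)[0]
    # A tie between the top labels is treated as a rejection.
    tied = sum(1 for v in counts.values() if v == votes) > 1
    if votes < min_agreement or tied:
        return REJECT
    return label


def recognize(ocr_outputs, icr_fallback):
    """Fuse aligned per-engine outputs; send rejected positions to the ICR.

    ocr_outputs: list of equal-length character lists, one per OCR engine.
    icr_fallback: hypothetical callable taking a position and returning a
    character (or None); it stands in for the specialized ICR.
    """
    result = []
    for position, candidates in enumerate(zip(*ocr_outputs)):
        label = fuse_votes(candidates)
        if label is REJECT:
            label = icr_fallback(position)
        result.append(label if label is not None else "?")
    return "".join(result)
```

In this sketch, a position on which the engines disagree is not guessed by the vote but handed to the fallback, which mirrors the division of labor the abstract describes: the OCR combination handles the common cases, and the ICR is consulted only for rejected characters.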

Contributor : Hubert Cecotti
Submitted on : Tuesday, September 27, 2005 - 4:02:02 PM
Last modification on : Friday, February 26, 2021 - 3:28:06 PM




Hubert Cecotti, Abdel Belaïd. Hybrid OCR combination for ancient documents. Third International Conference on Advances in Pattern Recognition - ICAPR 2005, Aug 2005, Bath/UK, pp.646-653, ⟨10.1007/11551188⟩. ⟨inria-00000366⟩


