inria-00103696, version 1
Discrimination Between Digits and Outliers in Handwritten Documents Applied to the Extraction of Numerical Fields
Tenth International Workshop on Frontiers in Handwriting Recognition (2006)
Abstract: In this article, we propose a system for extracting numerical fields from unconstrained handwritten documents. The system is based on a recognition-driven segmentation stage followed by a syntactical analysis that detects the sequences that may compose a numerical field. We focus here on the design of a digit classifier, embedded in the segmentation/recognition process, that is able to discriminate digits from outliers such as words, fragments of words, noise, etc. To this end, we have developed a light classifier that is applied before a standard digit classifier in order to reject “obvious outliers”. Several classifiers have been compared in terms of ROC curves and processing time.
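The two-stage idea described in the abstract (a light classifier that rejects obvious outliers before a standard digit classifier runs) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `make_cascade`, `outlier_score`, `digit_classifier` and `reject_threshold`, the feature representation, and the toy scoring rule are all assumptions made for illustration only.

```python
from typing import Callable, Optional, Sequence

# Hypothetical feature vector for one candidate component (assumption).
Component = Sequence[float]


def make_cascade(
    outlier_score: Callable[[Component], float],
    digit_classifier: Callable[[Component], int],
    reject_threshold: float,
) -> Callable[[Component], Optional[int]]:
    """Build a two-stage classifier: cheap rejection first, digit recognition second."""

    def classify(x: Component) -> Optional[int]:
        # Stage 1: the light classifier scores how outlier-like the component is.
        if outlier_score(x) >= reject_threshold:
            return None  # rejected as an obvious outlier; stage 2 is never run
        # Stage 2: only surviving components reach the costlier digit classifier.
        return digit_classifier(x)

    return classify


# Toy stand-ins (assumptions, not from the paper): the width/height ratio
# serves as the outlier score, and the digit classifier always answers 7.
def toy_outlier_score(x: Component) -> float:
    return x[0] / x[1]


def toy_digit_classifier(x: Component) -> int:
    return 7


cascade = make_cascade(toy_outlier_score, toy_digit_classifier, reject_threshold=3.0)
```

Sweeping `reject_threshold` over a range of values and recording, at each value, the share of outliers rejected against the share of digits wrongly rejected would give one ROC curve per candidate light classifier, which is the kind of comparison the abstract mentions.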
- 1: CNRS : FRE2645 – Université de Rouen – Institut National des Sciences Appliquées (INSA) - Rouen
- Domain : Computer Science/Document and Text Processing; Computer Science/Computer Vision and Pattern Recognition
- Keywords : Handwriting recognition – unconstrained documents – numerical field – information extraction
- Comment : http://www.suvisoft.com
- http://hal.inria.fr/inria-00103696
- oai:hal.inria.fr:inria-00103696
- Submitted on: Thursday, 5 October 2006 10:18:17
- Updated on: Thursday, 5 October 2006 11:32:21

