
inria-00103696, version 1

Discrimination Between Digits and Outliers in Handwritten Documents Applied to the Extraction of Numerical Fields

Clément Chatelain 1, Laurent Heutte 1, Thierry Paquet 1

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

Abstract: In this article, we propose a system for extracting numerical fields from unconstrained handwritten documents. The system is based on a recognition-driven segmentation stage followed by a syntactical analysis that detects the sequences that may compose a numerical field. We focus here on the design of a digit classifier, embedded in the segmentation/recognition process, that is able to discriminate digits from outliers such as words, fragments of words, noise, etc. To that end, we have developed a light classifier applied prior to a standard digit classifier in order to reject “obvious outliers”. Several classifiers have been compared in terms of ROC curve and processing time.
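
The two-stage idea in the abstract (a light classifier that rejects obvious outliers before a standard digit classifier runs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the feature (a width/height pair per connected component), the scoring rule, the threshold value, and the stub digit classifier are all hypothetical placeholders chosen only to make the example runnable. The helper `roc_points` shows one simple way to compute the (false positive rate, true positive rate) pairs used when comparing rejection classifiers by ROC curve.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Component = Tuple[float, float]  # hypothetical feature: (width, height)


def light_score(component: Component) -> float:
    """Hypothetical cheap score in [0, 1]: tall, narrow shapes look digit-like."""
    width, height = component
    return min(1.0, height / max(width, 1e-9))


def stub_digit_classifier(component: Component) -> int:
    """Placeholder for a real (more expensive) digit classifier."""
    width, _ = component
    return int(width) % 10


def make_cascade(
    score: Callable[[Component], float],
    reject_threshold: float,
    digit_classifier: Callable[[Component], int],
) -> Callable[[Component], Optional[int]]:
    """Build a classifier that rejects obvious outliers before digit recognition."""

    def classify(component: Component) -> Optional[int]:
        if score(component) < reject_threshold:
            return None  # rejected as an outlier; the digit classifier never runs
        return digit_classifier(component)

    return classify


def roc_points(
    scores: Sequence[float], is_digit: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) for each score threshold."""
    positives = sum(is_digit)
    negatives = len(is_digit) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, is_digit) if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, is_digit) if s >= threshold and not d)
        points.append((fp / negatives, tp / positives))
    return points


# Assumed threshold of 0.5, picked arbitrarily for the illustration.
cascade = make_cascade(light_score, 0.5, stub_digit_classifier)
accepted = cascade((10.0, 20.0))   # narrow component: passed to the digit classifier
rejected = cascade((80.0, 20.0))   # wide, word-like component: rejected early
```

In such a cascade, the rejection threshold trades missed digits against processing time saved on outliers, which is why a ROC curve and a timing measurement are natural comparison criteria.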

  • 1:  Perception, Systèmes, Information (PSI)
  • CNRS : FRE2645 – Université de Rouen – Institut National des Sciences Appliquées [INSA] - Rouen
  • Domain : Computer Science/Document and Text Processing
    Computer Science/Computer Vision and Pattern Recognition
  • Keywords : Handwriting recognition – unconstrained documents – numerical field – information extraction
  • Comment : http://www.suvisoft.com
 
  • oai:hal.inria.fr:inria-00103696
  • Submitted on: Thursday, 5 October 2006 10:18:17
  • Updated on: Thursday, 5 October 2006 11:32:21