
inria-00105158, version 1

Model-Based Annotation of Online Handwritten Datasets

Anand Kumar 1, A. Balasubramanian 1, Anoop Namboodiri 1, C.V. Jawahar 1

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

Abstract: Annotated handwriting datasets are a prerequisite for tackling a variety of problems, such as building recognizers and developing writer identification algorithms. However, annotating large datasets is a tedious and expensive process, especially at the character or stroke level. In this paper we propose a novel, automated method for annotation at the character level, given a parallel corpus of online handwritten data and the corresponding text. The method employs a model-based handwriting synthesis unit to map the two corpora to the same space; the annotation is then propagated to the word level and from there to the individual characters using elastic matching. The initial annotation results are used to improve the handwriting synthesis model for the user under consideration, which in turn refines the annotation. The method can handle errors in the handwriting such as spurious and missing strokes or characters. The output is stored in the UPXInkML format.
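
The abstract names elastic matching and dynamic programming as the means of propagating annotation from synthesized to real handwriting. The sketch below illustrates that idea with plain dynamic time warping (DTW), used here as a stand-in because the record does not specify the matching algorithm. The function names `dtw_align` and `propagate_labels`, the 2-D point representation, and the Euclidean point cost are all assumptions made for illustration, not the authors' implementation.

```python
import math


def dtw_align(a, b):
    """Align two sequences of (x, y) points with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching a[i] to b[j]. Hypothetical helper for illustration.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def propagate_labels(synth_points, synth_labels, ink_points):
    """Copy per-point character labels from synthesized ink to real ink.

    Assumes synth_labels[i] is the character that synthesized point i
    belongs to; each real point takes the label of its first match.
    """
    _, path = dtw_align(synth_points, ink_points)
    labels = [None] * len(ink_points)
    for i, j in path:
        if labels[j] is None:
            labels[j] = synth_labels[i]
    return labels
```

Calling `propagate_labels(synth_points, synth_labels, ink_points)` returns one label per real ink point. A fuller system along the lines of the abstract would presumably also tolerate spurious or missing strokes and re-estimate the synthesis model from the result; this sketch covers only the single alignment step.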

  • 1:  Center for Visual Information Technology (CVIT)
  • International Institute of Information Technology
  • Domain : Computer Science/Document and Text Processing
    Computer Science/Computer Vision and Pattern Recognition
  • Keywords : Annotation – Synthesis model – Elastic matching – Dynamic programming
  • Comment : http://www.suvisoft.com
 
  • oai:hal.inria.fr:inria-00105158
  • Submitted on: Tuesday, 10 October 2006 14:40:41
  • Updated on: Tuesday, 10 October 2006 16:30:36