Conference papers

Model-Based Annotation of Online Handwritten Datasets

Abstract : Annotated datasets of handwriting are a prerequisite for tackling a variety of problems such as building recognizers, developing writer identification algorithms, etc. However, the annotation of large datasets is a tedious and expensive process, especially at the character or stroke level. In this paper we propose a novel, automated method for annotation at the character level, given a parallel corpus of online handwritten data and the corresponding text. The method employs a model-based handwriting synthesis unit to map the two corpora to the same space, and the annotation is propagated to the word level and then to the individual characters using elastic matching. The initial results of annotation are used to improve the handwriting synthesis model for the user under consideration, which in turn refines the annotation. The method can handle errors in the handwriting such as spurious and missing strokes or characters. The output is stored in the UPXInkML format.
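
The abstract does not specify which elastic matching algorithm is used. As an illustration only, the sketch below uses dynamic time warping (DTW), a common form of elastic matching, to transfer character labels from a synthesized trace to a handwritten one. The function names `dtw_align` and `propagate_labels`, the Euclidean point distance, and the toy coordinates are all assumptions made for this example and are not taken from the paper.

```python
import math


def dtw_align(a, b):
    """Align two sequences of 2-D points with DTW.

    Returns (total_cost, path), where path is a list of (index_in_a, index_in_b).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def propagate_labels(synth_points, synth_labels, ink_points):
    """Give each handwritten point the label of the synthesized point it aligns to."""
    _, path = dtw_align(synth_points, ink_points)
    labels = [None] * len(ink_points)
    for si, ii in path:
        if labels[ii] is None:  # keep the first match for each handwritten point
            labels[ii] = synth_labels[si]
    return labels


# Toy data (hypothetical): a synthesized trace with per-point character labels
# and a slightly different handwritten trace of the same word.
synth = [(0, 0), (1, 0), (2, 0), (3, 0)]
synth_labels = ["a", "a", "b", "b"]
ink = [(0, 0.1), (0.5, 0.1), (1, 0.1), (2, 0.1), (3, 0.1)]

total_cost, _ = dtw_align(synth, ink)
ink_labels = propagate_labels(synth, synth_labels, ink)
print(ink_labels)
```

Because the warping path may repeat indices on either side, the two traces need not have the same number of points. A practical system would presumably also need to handle spurious or missing strokes, which this sketch does not attempt.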

https://hal.inria.fr/inria-00105158
Contributor : Anne Jaigu
Submitted on : Tuesday, October 10, 2006 - 2:40:41 PM
Last modification on : Tuesday, October 10, 2006 - 4:30:36 PM
Long-term archiving on : Tuesday, April 6, 2010 - 7:11:34 PM

Identifiers

  • HAL Id : inria-00105158, version 1

Citation

Anand Kumar, A. Balasubramanian, Anoop Namboodiri, C.V. Jawahar. Model-Based Annotation of Online Handwritten Datasets. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00105158⟩
