Model-Based Annotation of Online Handwritten Datasets

Abstract: Annotated handwriting datasets are a prerequisite for a variety of problems, such as building recognizers and developing writer identification algorithms. However, annotating large datasets is tedious and expensive, especially at the character or stroke level. In this paper we propose a novel, automated method for annotation at the character level, given a parallel corpus of online handwritten data and the corresponding text. The method employs a model-based handwriting synthesis unit to map the two corpora to the same space; the annotation is then propagated to the word level and from there to the individual characters using elastic matching. The initial annotation results are used to improve the handwriting synthesis model for the user under consideration, which in turn refines the annotation. The method can handle errors in the handwriting such as spurious and missing strokes or characters. The output is stored in the UPXInkML format.
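The abstract does not say which elastic matching algorithm is used. As an illustration only, the label propagation step can be sketched with dynamic time warping (DTW), one common form of elastic matching. This is not the authors' implementation: the function names `dtw_align` and `propagate_labels`, the use of plain (x, y) points, and the Euclidean point distance are all assumptions made for this sketch.

```python
import math

def dtw_align(synth, written):
    """Align two (x, y) point sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) pairs
    matching synth[i] to written[j]. Hypothetical helper, not from the paper.
    """
    n, m = len(synth), len(written)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i synth points
    # with the first j written points.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(synth[i - 1], written[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def propagate_labels(synth_labels, path, n_written):
    """Copy the character label of each synthesized point to the
    handwritten point it was aligned with (assumed propagation rule)."""
    labels = [None] * n_written
    for i, j in path:
        labels[j] = synth_labels[i]
    return labels

# Synthesized trace for a two-character word, one point per character,
# and a handwritten trace that dwells longer on each character.
synth = [(0.0, 0.0), (1.0, 0.0)]
synth_labels = ["a", "b"]
written = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0)]

total, path = dtw_align(synth, written)
written_labels = propagate_labels(synth_labels, path, len(written))
```

Because the synthesized trace is generated from the text, each of its points carries a known character label; the warping path lets those labels be transferred to the handwritten points even when the two traces differ in length or speed.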
Document type:
Conference paper
Guy Lorette. Tenth International Workshop on Frontiers in Handwriting Recognition, Oct 2006, La Baule (France), Suvisoft, 2006

https://hal.inria.fr/inria-00105158
Contributor: Anne Jaigu
Submitted on: Tuesday, 10 October 2006 - 14:40:41
Last modified on: Tuesday, 10 October 2006 - 16:30:36
Document(s) archived on: Tuesday, 6 April 2010 - 19:11:34

Identifiers

  • HAL Id : inria-00105158, version 1

Citation

Anand Kumar, A. Balasubramanian, Anoop Namboodiri, C.V. Jawahar. Model-Based Annotation of Online Handwritten Datasets. Guy Lorette. Tenth International Workshop on Frontiers in Handwriting Recognition, Oct 2006, La Baule (France), Suvisoft, 2006. 〈inria-00105158〉
