Mapping Transcripts to Handwritten Text - International Workshop on Frontiers in Handwriting Recognition
Conference Papers Year : 2006

Mapping Transcripts to Handwritten Text

Abstract

In the analysis and recognition of handwriting, a useful first task is to assign ground truth to the words in the writing. Such an assignment supports subsequent machine learning tasks such as automatic recognition and writer verification. Since automatic word segmentation and recognition can be error-prone, an intermediate approach is to use a text file containing a transcription of the handwriting image to perform the ground truth assignment. This paper describes an algorithm for finding the best word-level alignment between the transcript and the handwriting image. The algorithm is useful in tasks such as: (i) extracting words and characters as characteristic elements in writer verification and identification tasks; (ii) creating a large ground-truthed dataset for handwritten document analysis (at the word and even the character level); (iii) indexing a collection of handwritten materials, such as historical manuscripts, for document retrieval. The algorithm achieves 84.7% accuracy in aligning words on whole images when evaluated on 20 pages from a handwriting database created for forensic document examination studies.
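
The abstract does not say how the alignment is computed. Purely as an illustration of what a word-level alignment between a transcript and a page image involves, the following is a minimal sketch that assumes the image has already been split into candidate word segments with known pixel widths, and that a transcript word's expected width is proportional to its character count. The function name `align_words`, the width-based cost, and the `gap_cost` parameter are all hypothetical choices made for this example; this is a generic dynamic-programming alignment, not the authors' method.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def align_words(transcript: List[str], segment_widths: List[float],
                gap_cost: float = 1.0) -> List[Pair]:
    """Align transcript words to word-image segments (hypothetical sketch).

    Returns (word_index, segment_index) pairs in reading order.
    None on either side marks a skipped word or a skipped segment.
    """
    n, m = len(transcript), len(segment_widths)
    total_chars = sum(len(w) for w in transcript)
    # Assumption: every character occupies roughly the same width on the page.
    char_width = sum(segment_widths) / total_chars if total_chars else 1.0

    def match_cost(i: int, j: int) -> float:
        # Relative width mismatch in [0, 1]; 0 means a perfect fit.
        expected = len(transcript[i]) * char_width
        return abs(expected - segment_widths[j]) / max(expected, segment_widths[j], 1e-9)

    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                c = cost[i - 1][j - 1] + match_cost(i - 1, j - 1)
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "match"
            if i > 0:  # transcript word with no segment (e.g. missed by segmentation)
                c = cost[i - 1][j] + gap_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_word"
            if j > 0:  # segment with no transcript word (e.g. noise or a stray mark)
                c = cost[i][j - 1] + gap_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "skip_segment"

    # Trace the cheapest path back from the end of both sequences.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "skip_word":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # Four segments for three words: the narrow second segment is left unmatched.
    print(align_words(["the", "quick", "fox"], [30.0, 5.0, 50.0, 30.0]))
    # [(0, 0), (None, 1), (1, 2), (2, 3)]
```

A real system would replace the width-based cost with a score from a word recognizer or shape matcher; the width heuristic is used here only to keep the sketch self-contained.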

Dates and versions

inria-00112763, version 1 (09-11-2006)

Identifiers

  • HAL Id: inria-00112763, version 1

Cite

Chen Huang, Sargur N. Srihari. Mapping Transcripts to Handwritten Text. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00112763⟩

Collections

IWFHR10
