Conference papers

Mapping Transcripts to Handwritten Text

Abstract : In the analysis and recognition of handwriting, a useful first task is to assign ground truth to the words in the writing. Such an assignment supports subsequent machine learning tasks such as automatic recognition and writer verification. Since automatic word segmentation and recognition can be error-prone, an intermediate approach is to use a text file containing a transcription of the handwriting image to perform the ground truth assignment. This paper describes an algorithm for finding the best word-level alignment between the transcript and the handwriting image. The algorithm is useful in tasks such as: (i) extracting words and characters as characteristic elements in writer verification and identification tasks; (ii) creating a large ground-truthed dataset for handwritten document analysis (at the word and even character levels); (iii) indexing a collection of handwritten materials, such as historical manuscripts, for document retrieval. The algorithm achieves 84.7% accuracy in aligning words on whole images when evaluated on 20 pages from a handwriting database created for forensic document examination studies.
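The abstract does not spell out the alignment algorithm itself, so the following is only a minimal sketch of what a word-level alignment between a transcript and segmented word images could look like. It uses a generic edit-distance-style dynamic program and is not the authors' method. The function name `align_words`, the width-based matching cost, and the parameters `char_width` and `gap_cost` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def align_words(
    transcript: List[str],
    segment_widths: List[float],
    char_width: float = 10.0,   # assumed average pixel width of one character
    gap_cost: float = 30.0,     # assumed penalty for leaving an item unmatched
) -> List[Tuple[Optional[int], Optional[int]]]:
    """Align transcript words to image segments (illustrative only).

    Returns (word_index, segment_index) pairs in reading order;
    None marks a word or a segment that has no partner.
    """
    n, m = len(transcript), len(segment_widths)

    def match_cost(i: int, j: int) -> float:
        # Hypothetical cost: distance between the segment width and the
        # width predicted from the word's character count.
        return abs(len(transcript[i]) * char_width - segment_widths[j])

    # cost[i][j]: best cost of aligning the first i words with the first j segments.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap_cost
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + match_cost(i - 1, j - 1),  # pair word with segment
                cost[i - 1][j] + gap_cost,                      # word has no segment
                cost[i][j - 1] + gap_cost,                      # segment has no word
            )

    # Trace back from the bottom-right corner to recover the pairing.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + match_cost(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + gap_cost:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    while i > 0:  # leftover words at the start of the transcript
        pairs.append((i - 1, None))
        i -= 1
    while j > 0:  # leftover segments at the start of the image
        pairs.append((None, j - 1))
        j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # The 12-pixel segment stands in for a spurious fragment from over-segmentation.
    print(align_words(["the", "quick", "fox"], [31.0, 12.0, 48.0, 29.0]))
```

In this toy input the narrow second segment is left unmatched while the three transcript words pair with the remaining segments. A real system would replace the width heuristic with a recognition-based score; the choice here only keeps the sketch self-contained.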

https://hal.inria.fr/inria-00112763
Contributor : Anne Jaigu
Submitted on : Thursday, November 9, 2006 - 4:15:37 PM
Last modification on : Tuesday, August 13, 2019 - 11:40:13 AM
Long-term archiving on : Tuesday, April 6, 2010 - 10:02:54 PM

Identifiers

  • HAL Id : inria-00112763, version 1

Citation

Chen Huang, Sargur N. Srihari. Mapping Transcripts to Handwritten Text. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00112763⟩
