inria-00112763, version 1

Mapping Transcripts to Handwritten Text

Chen Huang 1, Sargur N. Srihari 1

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

  • 1: Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, United States

Bibliographic reference

  • Type of document: Peer-reviewed conferences/proceedings
  • Domain:
    Computer Science/Document and Text Processing
    Computer Science/Computer Vision and Pattern Recognition
  • Title: Mapping Transcripts to Handwritten Text
  • Abstract: In the analysis and recognition of handwriting, a useful first task is to assign ground truth to the words in the writing. Such an assignment supports subsequent machine learning tasks such as automatic recognition and writer verification. Because automatic word segmentation and recognition can be error-prone, an intermediate approach is to use a text file that transcribes the handwriting image to perform the ground-truth assignment. This paper describes an algorithm for finding the best word-level alignment between the transcript and the handwriting image. The algorithm is useful in tasks such as: (i) extracting words and characters as characteristic elements in writer verification and identification tasks; (ii) creating a large ground-truthed dataset for handwriting document analysis (at the word and even character level); (iii) indexing a collection of handwritten materials, such as historical manuscripts, for document retrieval. The algorithm achieves 84.7% accuracy in aligning words on whole images when evaluated on 20 pages from a handwriting database created for forensic document examination studies.
  • ACM Classification:
    I.: Computing Methodologies/I.5: PATTERN RECOGNITION
    I.: Computing Methodologies/I.7: DOCUMENT AND TEXT PROCESSING
  • Full text language: English
  • Publication date: 2006-10-23
  • Audience: not specified
  • Conference title: Tenth International Workshop on Frontiers in Handwriting Recognition
  • Conference city: La Baule (France)
  • Conference date: 2006-10-23
  • Organizer: Université de Rennes 1
  • Scientific editor(s): Guy Lorette
  • Commercial editor: Suvisoft
  • Keywords: Transcript mapping – word segmentation – word recognition
  • Comment: http://www.suvisoft.com
  • Contract, financing: Université de Rennes 1

Files attached to this document:

  • cr113081015841.pdf (PDF, 513 KB)
 
  • oai:hal.inria.fr:inria-00112763
  • Submitted on: Thursday, 9 November 2006 16:15:37
  • Updated on: Thursday, 9 November 2006 16:52:39