Conference paper, Year: 2006

Sentence Boundary Detection for Handwritten Text Recognition

Abstract

In the larger context of handwritten text recognition systems, many natural language processing techniques can potentially be applied to the output of such systems. However, these techniques often assume that the input is segmented into meaningful units, such as sentences. This paper investigates the use of hidden-event language models and a maximum-entropy-based method for sentence boundary detection. While hidden-event language models are simple to train, the maximum entropy framework allows for easy integration of various knowledge sources. The segmentation performance of these two approaches is evaluated on the IAM Database for handwritten English text, and results on true words as well as recognized words are provided. Finally, a combination of the two techniques is shown to achieve superior performance over both individual methods.
Main file
cr101742506742.pdf (88.6 KB)

Dates and versions

inria-00103835, version 1 (05-10-2006)

Identifiers

  • HAL Id: inria-00103835, version 1

Cite

Matthias Zimmermann. Sentence Boundary Detection for Handwritten Text Recognition. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00103835⟩

Collections

IWFHR10
