Sentence Boundary Detection for Handwritten Text Recognition

Matthias Zimmermann

Conference paper, Year: 2006

International Computer Science Institute [Berkeley]

Abstract

In the larger context of handwritten text recognition systems, many natural language processing techniques can potentially be applied to the output of such systems. However, these techniques often assume that the input is segmented into meaningful units, such as sentences. This paper investigates the use of hidden-event language models and a maximum entropy based method for sentence boundary detection. While hidden-event language models are simple to train, the maximum entropy framework allows for easy integration of various knowledge sources. The segmentation performance of these two approaches is evaluated on the IAM Database for handwritten English text, and results on true words as well as recognized words are provided. Finally, a combination of the two techniques is shown to achieve superior performance over both individual methods.

Keywords

Offline Handwritten Text Recognition; Natural Language Processing

Domains

Document and Text Processing; Computer Vision and Pattern Recognition [cs.CV]

Main file

cr101742506742.pdf (88.6 KB)

https://inria.hal.science/inria-00103835

Submitted on: Thursday, 5 October 2006, 12:43:48

Last modified on: Thursday, 5 October 2006, 14:23:14

Long-term archiving on: Tuesday, 6 April 2010, 18:25:17

Dates and versions

inria-00103835, version 1 (05-10-2006)

Identifiers

HAL Id: inria-00103835, version 1

Cite

Matthias Zimmermann. Sentence Boundary Detection for Handwritten Text Recognition. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00103835⟩

Collections

IWFHR10
