A Priori and A Posteriori Integration and Combination of Language Models in an On-line Handwritten Sentence Recognition System

Solen Quiniou a1, Eric Anquetil b1

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

Abstract: This paper investigates the integration of different language models into an on-line sentence recognition system. The impact of n-gram and n-class models (the latter based on either statistical or morpho-syntactic classes), built on the Brown corpus, is compared in terms of word recognition rate. Furthermore, their integration at different steps of the recognition process (during recognition or to rescore the N-best list of proposed sentences) is considered, showing that performance is better the earlier the models are used. Combinations of these models are also studied, together with their integration at the aforementioned recognition steps. All experiments are carried out on sentences from the Brown corpus written by several writers.

  • a –  Université Rennes I
  • b –  Institut National des Sciences Appliquées de Rennes
  • 1:  IMADOC (IRISA)
  • Institut National des Sciences Appliquées (INSA) - Rennes – CNRS : UMR6074 – Université de Rennes 1
  • Domain : Computer Science/Computer Vision and Pattern Recognition
    Computer Science/Document and Text Processing
  • Keywords : On-line sentence recognition – statistical language models – model combination – N-best list rescoring.
