Conference paper, Year: 2006

Segmentation-Based And Segmentation-Free Methods for Spotting Handwritten Arabic Words

Abstract

Given a set of handwritten documents, a common goal is to search for a relevant subset. Attempting to find a query word or image in such a set of documents is called word spotting. Spotting handwritten words in documents written in the Latin alphabet, and more recently in Arabic, has received considerable attention. One issue is generating candidate word regions on a page. Attempting to segment the document definitively into such regions (automatic segmentation) can meet with some success, but the performance of the segmentation algorithm is often a limiting factor in spotting performance. Another approach is to scan the page image directly without attempting to generate such a definite segmentation. A new algorithm for word spotting is presented, together with a comparison of recent algorithms that act on previously unsegmented Arabic handwritten text. The algorithms considered are an automated word segmentation method presented previously and a “segmentation-free” algorithm that performs spotting directly on lines of unsegmented text. The segmentation-free approach performs spotting and segmentation concurrently using a sliding window. The spotting method used to judge the performance of the algorithms is character-based, but the results are independent of the actual spotting method used. The segmentation-free method performs on average 5-10% better than the automated segmentation method and has a lower per-query cost on unprocessed images. However, it has a higher per-query cost on preprocessed documents.
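As a rough illustration of the sliding-window idea mentioned in the abstract, the sketch below slides a query template across a line of unsegmented text and reports the column offsets where the match score passes a threshold. It is not the authors' algorithm: the binary-image representation, the names `match_score` and `spot_in_line`, the `threshold` parameter, and the pixel-agreement score are all assumptions made for illustration (the paper itself judges matches with a character-based spotting method).

```python
from typing import List, Tuple

# Assumed representation: a binary image is a list of rows of 0/1 pixels.
Image = List[List[int]]


def match_score(window: Image, template: Image) -> float:
    """Fraction of pixels on which window and template agree.

    Placeholder scorer, standing in for a real word-spotting method.
    Assumes both images have the same height and width.
    """
    total = len(template) * len(template[0])
    agree = sum(
        1
        for w_row, t_row in zip(window, template)
        for w, t in zip(w_row, t_row)
        if w == t
    )
    return agree / total


def spot_in_line(
    line: Image, template: Image, threshold: float = 0.9
) -> List[Tuple[int, float]]:
    """Slide a template-wide window over the line, one column at a time.

    Returns (start_column, score) for every window whose score reaches
    the threshold, so candidate regions are found while spotting rather
    than by a separate segmentation pass.
    """
    width = len(template[0])
    line_width = len(line[0])
    hits = []
    for start in range(line_width - width + 1):
        # Cut the same column range out of every row of the line.
        window = [row[start:start + width] for row in line]
        score = match_score(window, template)
        if score >= threshold:
            hits.append((start, score))
    return hits
```

With a 2x3 template and a 2x6 line that contains an exact copy of the template starting at column 2, `spot_in_line(line, template)` reports that single offset. A practical system would also need to handle windows of varying width and overlapping hits, which this sketch leaves out.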
Main file
cr1111243189725.pdf (1.19 MB)

Dates and versions

inria-00112708, version 1 (09-11-2006)

Identifiers

  • HAL Id: inria-00112708, version 1

Cite

Gregory R. Ball, Sargur N. Srihari, Harish Srinivasan. Segmentation-Based And Segmentation-Free Methods for Spotting Handwritten Arabic Words. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00112708⟩

Collections

IWFHR10
