A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research
Conference paper, Year: 2006

Abstract

This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numerals and characters for use in optical character recognition research. The database is freely available for academic use; until now, no such freely available database has existed for the Farsi language. It contains grayscale images of 52,380 characters and 17,740 numerals, each scanned at 300 dpi from Iranian school entrance exam forms during the years 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples per class is non-uniform, reflecting the real-life distribution of the characters. For comparison purposes, each dataset has been divided into training and test sets.
Main file
cr1096180506660.pdf (428.36 KB)

Dates and versions

inria-00112676, version 1 (09-11-2006)

Identifiers

  • HAL Id: inria-00112676, version 1

Cite

Saeed Mozaffari, Karim Faez, Farhad Faradji, Majid Ziaratban, S. Mohamad Golzan. A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research. Tenth International Workshop on Frontiers in Handwriting Recognition, Université de Rennes 1, Oct 2006, La Baule (France). ⟨inria-00112676⟩

Collections

IWFHR10
1589 views
1720 downloads
