inria-00112676, version 1

A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research

Saeed Mozaffari 1, Karim Faez 1, Farhad Faradji 1, Majid Ziaratban 1, S. Mohamad Golzan 1

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

  • 1:  Pattern Recognition and Image Processing Laboratory

  • Amirkabir University of Technology, Tehran, Islamic Republic of Iran

Bibliographic reference

  • Type of document: Peer-reviewed conferences/proceedings
  • Domain:
    Computer Science/Document and Text Processing
    Computer Science/Computer Vision and Pattern Recognition
  • Title: A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research
  • Abstract: This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numerals and characters for use in optical character recognition research. The database is freely available for academic use; to date, no such freely available database has existed for the Farsi language. It contains grayscale images of 52,380 characters and 17,740 numerals. Each image was scanned at 300 dpi from Iranian school entrance exam forms completed during 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples per class is non-uniform, reflecting the real-life distribution of the characters. For comparison purposes, each dataset has been divided into training and test sets.
  • ACM Classification:
    I.: Computing Methodologies/I.5: PATTERN RECOGNITION
    I.: Computing Methodologies/I.7: DOCUMENT AND TEXT PROCESSING
  • Full text language: English
  • Publication date: 2006-10-23
  • Audience: not specified
  • Conference title: Tenth International Workshop on Frontiers in Handwriting Recognition
  • Conference city: La Baule (France)
  • Conference date: 2006-10-23
  • Organizer: Université de Rennes 1
  • Scientific editor(s): Guy Lorette
  • Commercial editor: Suvisoft
  • Keywords: OCR – Farsi/Arabic – Comparative database – offline – isolated numbers and characters
  • Comment: http://www.suvisoft.com
  • Contract, financing: Université de Rennes 1

Files attached to this document:

PDF
cr1096180506660.pdf(428.4 KB)
 
  • oai:hal.inria.fr:inria-00112676
  • Submitted on: Thursday, 9 November 2006 15:13:20
  • Updated on: Thursday, 9 November 2006 16:50:07