
inria-00104719, version 1

A New Table Extraction and Recovery Methodology with Little Use of Previous Knowledge

Luiz Antônio Pereira Neves (1, 2), João Marques De Carvalho (1), Jacques Facon (2), Flávio Bortolozzi (2)

Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

Abstract: A new methodology for table-form extraction and recovery with little previous knowledge is presented. The first module identifies line intersections in a table-form. The second module detects and corrects wrong intersections produced by faulty intersection segments or by table artefacts (smudges, overlapping handwritten data and faulty segments). In this module, an artefact identification method for handwritten filled table-forms is proposed; it aims to detect, identify and remove table-form artefacts with little use of previous knowledge. The third module performs the table-form cell extraction. The methodology was evaluated on a total of 305 table-form images. Experiments showed promising results: the artefact identification method improves table-form interpretation rates, and the proposed approach reached a success rate of up to 85%. The main advantage of the presented methodology is that it requires little knowledge about the documents, so it can be applied to most table-forms.
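The first and third modules named in the abstract (intersection identification and cell extraction) can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not detail. It assumes a clean binary image with one-pixel-thick ruling lines, and the function names and the `min_len` parameter are hypothetical. The second module (artefact detection and correction) is not modelled here.

```python
from itertools import groupby


def long_run_mask(rows, min_len):
    """Mark pixels lying in a horizontal run of 1s at least min_len long."""
    mask = [[False] * len(row) for row in rows]
    for y, row in enumerate(rows):
        x = 0
        for value, group in groupby(row):
            n = len(list(group))
            if value == 1 and n >= min_len:
                for i in range(x, x + n):
                    mask[y][i] = True
            x += n
    return mask


def transpose(grid):
    """Swap rows and columns of a rectangular grid."""
    return [list(col) for col in zip(*grid)]


def find_intersections(image, min_len=3):
    """Return (row, col) pixels that sit on both a long horizontal
    and a long vertical run (assumed here to be ruling lines)."""
    horiz = long_run_mask(image, min_len)
    vert = transpose(long_run_mask(transpose(image), min_len))
    return {
        (y, x)
        for y, row in enumerate(image)
        for x in range(len(row))
        if horiz[y][x] and vert[y][x]
    }


def extract_cells(points):
    """Return cells (top, left, bottom, right) whose four corners are
    intersections on consecutive intersection rows and columns."""
    ys = sorted({y for y, _ in points})
    xs = sorted({x for _, x in points})
    cells = []
    for y0, y1 in zip(ys, ys[1:]):
        for x0, x1 in zip(xs, xs[1:]):
            corners = {(y0, x0), (y0, x1), (y1, x0), (y1, x1)}
            if corners <= points:
                cells.append((y0, x0, y1, x1))
    return cells


# Hypothetical 5x7 table-form with two side-by-side cells.
demo = [[int(c) for c in line] for line in [
    "1111111",
    "1001001",
    "1001001",
    "1001001",
    "1111111",
]]
demo_cells = extract_cells(find_intersections(demo))
```

On real scanned forms, broken lines, smudges and overlapping handwriting would produce missing or spurious intersections. That is the problem the second module of the methodology is described as addressing.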

  • 1:  Federal University of Campina Grande [Brazil] (UFCG)
  • 2:  Pontifical Catholic University of Paraná (PUCPR)
  • Domain: Computer Science/Document and Text Processing
    Computer Science/Computer Vision and Pattern Recognition
  • Keywords: Table-form recognition – Table-form extraction – Handwritten data – Document segmentation
  • Comment: http://www.suvisoft.com
 
  • oai:hal.inria.fr:inria-00104719
  • Submitted on: Monday, 9 October 2006 11:39:47
  • Updated on: Monday, 9 October 2006 12:09:12