A New Table Extraction and Recovery Methodology with Little Use of Previous Knowledge
Abstract
A new methodology for table-form extraction and recovery that requires little prior knowledge is presented. The first module identifies line intersections in a table-form. The second module detects and corrects wrong intersections produced by faulty line segments or by table artefacts (smudges, overlapping handwritten data, and faulty segments). Within this module, an artefact identification method for handwritten filled-in table-forms is proposed; it aims to detect, identify, and remove table-form artefacts with little use of prior knowledge. The third module performs table-form cell extraction. The methodology is evaluated on a total of 305 table-form images. The experiments showed promising results: the artefact identification method improves table-form interpretation rates, and the proposed approach reached a success rate of up to 85%. The main advantage of the presented methodology is that it requires little knowledge about the documents, so it can be applied to most table-forms.
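To make the first and third modules more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that horizontal and vertical line segments have already been detected and are axis-aligned, and it omits the second module (artefact detection and intersection correction) entirely. The names `find_intersections`, `extract_cells`, and the tolerance parameter `tol` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

HSeg = Tuple[int, int, int]        # horizontal segment: (y, x_start, x_end)
VSeg = Tuple[int, int, int]        # vertical segment: (x, y_start, y_end)
Point = Tuple[int, int]            # intersection: (x, y)
Cell = Tuple[int, int, int, int]   # cell: (x_left, y_top, x_right, y_bottom)


def find_intersections(h_segs: List[HSeg], v_segs: List[VSeg],
                       tol: int = 2) -> Set[Point]:
    """Return the points where a horizontal and a vertical segment meet.

    `tol` (an assumed parameter) lets segments that stop a few pixels
    short of each other still count as intersecting.
    """
    points: Set[Point] = set()
    for y, x0, x1 in h_segs:
        for x, y0, y1 in v_segs:
            if x0 - tol <= x <= x1 + tol and y0 - tol <= y <= y1 + tol:
                points.add((x, y))
    return points


def extract_cells(points: Set[Point]) -> List[Cell]:
    """Return rectangles between adjacent grid lines whose four corners
    are all detected intersections.

    Simplifying assumption: cells spanning several rows or columns
    (merged cells) are not recovered by this sketch.
    """
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    cells: List[Cell] = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            corners = {(left, top), (right, top),
                       (left, bottom), (right, bottom)}
            if corners <= points:
                cells.append((left, top, right, bottom))
    return cells
```

For a regular grid drawn with three horizontal and three vertical lines, `find_intersections` yields nine points and `extract_cells` yields four cells. A missing or spurious intersection changes the recovered cells directly, which is why a correction step such as the second module matters before cell extraction.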