Dynamic filters selection for textual document images denoising
Abstract
For a given document class, one challenge in document restoration is to automatically find a set of filters adapted to the degradation level of the images. It is also important to know which filters to apply and where they can be applied to advantage. In this paper, we present a multi-classifier solution for the extraction of linear filters, which are used for binarization and image denoising. The technique starts by clustering close pixels with K-means into as many clusters as there are filters. Each cluster is dedicated to one filter, which corresponds to a supervised neural network. These classifiers are trained against a binarized image that is weighted as a function of the effects of an erosion transformation. The method is compared with classical binarization techniques from the literature. On commercial OCR systems, it improves the recognition rate by 0.16% for Finereader7 and 1.06% for Omnipage14.
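The pipeline described in the abstract (K-means clustering of pixels, then one supervised filter per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: pixels are clustered on their 3x3 neighbourhood, each "neural network" is reduced to a single logistic unit (a linear filter followed by a sigmoid), and the training target is a plain reference binary image rather than the erosion-weighted one mentioned in the abstract. The names `patches`, `kmeans`, `train_filter` and `binarize` are hypothetical.

```python
import math
import random

def patches(img, r=1):
    """Flattened (2r+1)x(2r+1) neighbourhood of every pixel, edges replicated."""
    h, w = len(img), len(img[0])
    feats, coords = [], []
    for y in range(h):
        for x in range(w):
            feats.append([img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                          for dy in range(-r, r + 1) for dx in range(-r, r + 1)])
            coords.append((y, x))
    return feats, coords

def kmeans(feats, k, iters=10, seed=0):
    """Plain Lloyd iterations; returns one cluster label per feature vector."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(feats, k)]
    labels = [0] * len(feats)
    for _ in range(iters):
        for i, f in enumerate(feats):
            labels[i] = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(f, centers[j])))
        for j in range(k):
            members = [f for f, l in zip(feats, labels) if l == j]
            if members:  # keep the old centre if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def train_filter(X, t, epochs=50, lr=0.5):
    """Assumed stand-in for one classifier: a logistic unit trained by SGD."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # avoid overflow in exp
            g = 1.0 / (1.0 + math.exp(-z)) - target
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def binarize(noisy, reference, k=2):
    """Cluster pixels, train one filter per cluster, apply each filter to its cluster."""
    feats, coords = patches(noisy)
    labels = kmeans(feats, k)
    targets = [reference[y][x] for y, x in coords]
    filters = {}
    for j in set(labels):  # only clusters that actually received pixels
        X = [f for f, l in zip(feats, labels) if l == j]
        t = [v for v, l in zip(targets, labels) if l == j]
        filters[j] = train_filter(X, t)
    out = [[0] * len(noisy[0]) for _ in noisy]
    for f, (y, x), l in zip(feats, coords, labels):
        w, b = filters[l]
        out[y][x] = 1 if sum(wi * xi for wi, xi in zip(w, f)) + b > 0 else 0
    return out, labels

# Synthetic 8x8 example: a vertical stroke (1 = ink) with reduced contrast and noise.
reference = [[1 if 2 <= x <= 4 else 0 for x in range(8)] for _ in range(8)]
rng = random.Random(1)
noisy = [[min(1.0, max(0.0, 0.2 + 0.6 * v + rng.uniform(-0.15, 0.15))) for v in row]
         for row in reference]
result, labels = binarize(noisy, reference, k=2)
```

Here the filters are trained and applied on the same image to keep the sketch short. In a setting closer to the one described above, the filters would presumably be learned on training images of the document class and then applied to unseen pages.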
Domains
Document and Text Processing
Origin: Files produced by the author(s)