A combined strategy of analysis for the localization of heterogeneous form fields in ancient pre-printed records

Abstract: This paper addresses the localization of handwritten fields in old pre-printed registers. The images exhibit the usual difficulties of old and damaged documents, and text extraction is further complicated by the strong interaction between handwritten and printed text. In addition, in many collections the structure of the forms varies with the origin of the documents. This work is applied to a database of Mexican marriage records, which was published for a competition at the HIP 2013 workshop and is publicly available. We show the interest and limitations of the empirical method that was submitted to the competition. We then present a method that combines a logical description of the contents of the documents with the result of an automatic analysis of the physical properties of the collection. A particular feature of this analysis is that it does not require any ground truth. We show that this combined strategy can locate 97.2% of handwritten fields. The proposed approach is generalizable and could be applied to other databases.
Document type:
Journal article
International Journal on Document Analysis and Recognition, Springer Verlag, 2018, 21(4) (269-282), 〈10.1007/s10032-018-0309-y〉

https://hal.inria.fr/hal-01858192
Contributor: Aurélie Lemaitre
Submitted on: Monday, 20 August 2018 - 10:56:00
Last modified on: Friday, 16 November 2018 - 01:40:41
Document(s) archived on: Wednesday, 21 November 2018 - 12:59:20

File

article-acceptedmanuscript.pdf
Files produced by the author(s)

Citation

Aurélie Lemaitre, Jean Camillerapp, Cérès Carton, Bertrand Coüasnon. A combined strategy of analysis for the localization of heterogeneous form fields in ancient pre-printed records. International Journal on Document Analysis and Recognition, Springer Verlag, 2018, 21(4) (269-282), 〈10.1007/s10032-018-0309-y〉. 〈hal-01858192〉
