Bayesian networks for incomplete data analysis in form processing

Emilie Philippot (1), Santosh K.C. (1), Abdel Belaïd (1), Yolande Belaïd (1)
(1) READ - Recognition of writing and analysis of documents
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In this paper, we study Bayesian networks (BNs) for form identification based on partially filled fields. The approach uses electronic ink-tracing files and requires no prior information about form structure. Given a form format, the ink-tracing files are used to build the BN by expressing the possible relationships between corresponding fields as conditional probabilities, proceeding from individual fields up to the construction of the complete model. To simplify the BN, we subdivide a single form into three areas (header, body, and footer) and then integrate them, and we study three fundamental BN learning algorithms: Naive, Peter & Clark (PC), and maximum weighted spanning tree (MWST). We validate this framework on a real-world industrial problem, namely electronic note-taking in form processing. The approach provides satisfactory results, supporting the value of BNs for incomplete form analysis problems in particular.
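As an illustration of the MWST idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each form is encoded as a tuple of binary indicators (1 if a field is filled, 0 otherwise) and that edges are weighted by empirical mutual information, as in the common Chow-Liu formulation; the abstract does not state which weighting the paper uses. The names `mutual_information`, `mwst_edges`, and the toy `rows` data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(rows, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(rows)
    count_i = Counter(r[i] for r in rows)
    count_j = Counter(r[j] for r in rows)
    count_ij = Counter((r[i], r[j]) for r in rows)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def mwst_edges(rows):
    """Return the undirected edges of a maximum weighted spanning tree.

    Assumption: edge weight = mutual information between two fields.
    The tree is built with Kruskal's algorithm and a union-find structure.
    """
    n_vars = len(rows[0])
    weighted = sorted(
        ((mutual_information(rows, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,  # heaviest edges first
    )
    parent = list(range(n_vars))

    def find(x):
        # Find the component root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in weighted:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge creates no cycle
            parent[root_i] = root_j
            tree.append((i, j))
    return tree


# Hypothetical toy data: fields 0 and 1 are always filled together,
# as are fields 2 and 3; the two pairs are only weakly related.
rows = [
    (1, 1, 0, 0),
    (1, 1, 1, 1),
    (0, 0, 0, 0),
    (0, 0, 1, 1),
    (1, 1, 1, 1),
    (0, 0, 0, 0),
]
tree = mwst_edges(rows)
```

On this toy data the tree links the two strongly dependent pairs of fields and joins them through a single weaker edge. Orienting the edges from a chosen root and estimating the conditional probability tables would be the remaining steps toward a usable BN.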
Document type:
Journal article
International journal of machine learning and cybernetics, 2014, pp.25. 〈10.1007/s13042-014-0234-4〉

Cited literature: 43 references

https://hal.inria.fr/hal-01099727
Contributor: Yolande Belaid
Submitted on: Monday, January 5, 2015 - 10:55:20
Last modified on: Tuesday, April 24, 2018 - 13:32:39
Document(s) archived on: Wednesday, June 3, 2015 - 13:06:13

File

MLC_final.pdf
Files produced by the author(s)

Citation

Emilie Philippot, Santosh K.C., Abdel Belaïd, Yolande Belaïd. Bayesian networks for incomplete data analysis in form processing. International journal of machine learning and cybernetics, 2014, pp.25. 〈10.1007/s13042-014-0234-4〉. 〈hal-01099727〉
