Entity Local Structure Graph Matching for Mislabeling Correction

Abstract: This paper proposes an approach for comparing entity local structures based on inexact subgraph matching. The comparison results are used to correct mislabeling in the local structure. A local structure is a set of entity attribute labels that are physically close in a document image. It is modeled as an attributed graph whose nodes describe the content and presentation features of the labels and whose arcs describe their geometrical features. A local structure graph is matched against a structure model, which is a set of local structure model graphs. The structure model is initially built from a set of well-chosen local structures using a graph clustering algorithm, and is then updated incrementally. The subgraph matching uses a specific cost function that integrates the feature dissimilarities. The matched model graph is used to extract missed labels, prune extraneous ones, and correct erroneous label fields in the local structure. Evaluated on 525 local structures extracted from 200 business documents, the structure comparison approach achieves about 90% recall and 95% precision. The mislabeling correction rates in these local structures vary between 73% and 100%.
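
To make the matching idea concrete, here is a minimal sketch of inexact subgraph matching driven by a cost that sums node and arc dissimilarities. It is not the paper's algorithm or cost function: the graph representation, the feature names (`label`, `kind`), the weights, the `scale` parameter, and the brute-force enumeration are all assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical toy representation: a graph is a pair (nodes, arcs).
# nodes: list of dicts with a "label" (attribute label) and a "kind" (content feature).
# arcs: dict mapping (i, j) to the (dx, dy) offset from node i to node j.

def node_cost(a, b):
    # 0 when label and kind both agree, 1 when both differ (assumed equal weights).
    return 0.5 * (a["label"] != b["label"]) + 0.5 * (a["kind"] != b["kind"])

def arc_cost(u, v, scale=100.0):
    # Manhattan distance between two offsets, normalized by an assumed scale and capped at 1.
    return min(1.0, (abs(u[0] - v[0]) + abs(u[1] - v[1])) / scale)

def match(local, model):
    """Return (cost, mapping) for the cheapest injective mapping of local nodes to model nodes."""
    l_nodes, l_arcs = local
    m_nodes, m_arcs = model
    if len(l_nodes) > len(m_nodes):
        raise ValueError("this sketch assumes the local graph is no larger than the model")
    best_cost, best_map = float("inf"), None
    # Brute-force enumeration: only viable for very small graphs.
    for perm in permutations(range(len(m_nodes)), len(l_nodes)):
        cost = sum(node_cost(n, m_nodes[perm[i]]) for i, n in enumerate(l_nodes))
        for (i, j), offset in l_arcs.items():
            target = m_arcs.get((perm[i], perm[j]))
            # A local arc with no counterpart in the model gets the maximum penalty.
            cost += 1.0 if target is None else arc_cost(offset, target)
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_cost, best_map

def corrections(local, model, mapping):
    """Derive relabelings and missed labels from a mapping."""
    l_nodes, m_nodes = local[0], model[0]
    relabel = {i: m_nodes[m]["label"] for i, m in enumerate(mapping)
               if l_nodes[i]["label"] != m_nodes[m]["label"]}
    missed = [m_nodes[m]["label"] for m in range(len(m_nodes)) if m not in mapping]
    return relabel, missed

# Model graph: name above street above zip code.
model = (
    [{"label": "name", "kind": "text"},
     {"label": "street", "kind": "text"},
     {"label": "zip", "kind": "number"}],
    {(0, 1): (0, 20), (1, 2): (0, 20), (0, 2): (0, 40)},
)
# Local structure: the street was missed and the zip code was mislabeled as a phone number.
local = (
    [{"label": "name", "kind": "text"},
     {"label": "phone", "kind": "number"}],
    {(0, 1): (0, 40)},
)

cost, mapping = match(local, model)
relabel, missed = corrections(local, model, mapping)
```

In this toy case the cheapest mapping pairs the mislabeled node with the model's `zip` node, so `corrections` suggests relabeling it and reports `street` as a missed label. Pruning of extraneous labels is not covered, because the sketch requires every local node to be mapped.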
Document type:
Conference paper
Document Analysis Systems, Apr 2016, Santorini, Greece. pp.257-262, 〈10.1109/DAS.2016.36〉


https://hal.inria.fr/hal-01304257
Contributor: Nihel Kooli
Submitted on: Thursday, 21 July 2016 - 15:29:53
Last modified on: Tuesday, 24 April 2018 - 13:37:23

File

1792a257.pdf
Files produced by the author(s)


Citation

Nihel Kooli, Abdel Belaïd, Aurélie Joseph, Vincent Poulain D'Andecy. Entity Local Structure Graph Matching for Mislabeling Correction. Document Analysis Systems, Apr 2016, Santorini, Greece. pp.257-262, 〈10.1109/DAS.2016.36〉. 〈hal-01304257〉

