Conference paper, Year: 2000

Recognition of Table of Contents for Electronic Library

Abdel Belaïd
  • Role: Author
Nabil Murshed
  • Role: Author

Abstract

This paper describes a labeling approach for the automatic recognition of Tables of Contents (ToC). A prototype is used for electronic consultation of scientific papers in a digital library system named Calliope. The method operates on a roughly structured ASCII file produced by OCR. Recognition proceeds by text labeling, without using any a priori model. Labeling is based on Part-of-Speech (PoS) tagging, which is initiated by a primary labeling of text components using specific dictionaries.
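
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of two-pass labeling: a primary, dictionary-driven pass followed by a contextual pass that resolves tokens left unlabeled. The dictionaries, the label names (`NAME`, `FUNC`, `WORD`, `LEADER`, `PAGE`) and the contextual rules are all assumptions made for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
import re

# Hypothetical dictionaries: the paper's actual dictionaries are not described
# in the abstract, so these tiny sets are placeholders for illustration.
FIRST_NAMES = {"abdel", "nabil"}
FUNCTION_WORDS = {"of", "for", "the", "and", "in", "a", "an", "on"}


def primary_label(token):
    """First pass: label one token from its surface form or a dictionary hit."""
    if re.fullmatch(r"\.{2,}", token):
        return "LEADER"  # dot leaders between a title and its page number
    lower = token.lower().strip(".,;:")
    if re.fullmatch(r"\d+", lower):
        return "NUM"
    if lower in FIRST_NAMES:
        return "NAME"
    if lower in FUNCTION_WORDS:
        return "FUNC"
    return "UNK"  # left for the contextual pass


def label_line(line):
    """Second pass: resolve UNK tokens using the labels of their neighbours."""
    tokens = line.split()
    primary = [primary_label(t) for t in tokens]
    labels = list(primary)
    for i, tok in enumerate(tokens):
        if primary[i] != "UNK":
            continue
        # Assumed rule: a capitalized unknown right after a known first name
        # is treated as a surname; every other unknown is a plain word.
        after_first_name = i > 0 and primary[i - 1] == "NAME"
        labels[i] = "NAME" if after_first_name and tok[:1].isupper() else "WORD"
    # Assumed rule: a number that closes the line is a page reference.
    if labels and labels[-1] == "NUM":
        labels[-1] = "PAGE"
    return list(zip(tokens, labels))


if __name__ == "__main__":
    for token, label in label_line("Abdel Belaïd, Recognition of Tables ..... 12"):
        print(f"{label:7s}{token}")
```

In this sketch the second pass reads only the first-pass labels of neighbouring tokens, so a single dictionary hit cannot propagate along a whole line. A real PoS tagger would replace these hand-written rules.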
No file deposited

Dates and versions

inria-00099147, version 1 (26-09-2006)

Identifiers

  • HAL Id: inria-00099147, version 1

Cite

Abdel Belaïd, Nabil Murshed. Recognition of Table of Contents for Electronic Library. 4th International Workshop on Document Analysis Systems - DAS'2000, 2000, Rio de Janeiro, Brazil, 28 p. ⟨inria-00099147⟩