inria-00176680, version 1
XML data representation in Document Image Analysis
Abdel Belaid (a, 1), Ingrid Falk (2), Yves Rangoni (a, 1)
9th International Conference on Document Analysis and Recognition - ICDAR'07 (2007) 78-82
Abstract: This paper presents the XML-based formats ALTO, TEI, and METS, which are used in digital libraries, and examines their usefulness for data representation in a Document Image Analysis and Recognition (DIAR) process. The first part briefly presents these formats, focusing on their suitability for the structural representation and modeling of DIAR data. The second part shows how the formats can be used in a reverse engineering process and how they are implemented as a data representation framework.
- a – Université Nancy II
- 1: READ (LORIA)
- INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
- 2: TALARIS (INRIA Nancy - Grand Est / LORIA)
- CNRS : UMR7503 – INRIA – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
- Domain: Computer Science/Computer Vision and Pattern Recognition; Computer Science/Document and Text Processing
- Keywords: XML – TEI – ALTO – METS – Document Image Analysis and Recognition – XSLT – Reverse Engineering – Document Class Model
- http://hal.inria.fr/inria-00176680
- oai:hal.inria.fr:inria-00176680
- From: Yves Rangoni
- Submitted on: Thursday, 4 October 2007 13:25:51
- Updated on: Tuesday, 31 May 2011 10:31:29





