Conference paper, Year: 2017

Extracting linked data from statistic spreadsheets

Abstract

Statistical data is an important sub-category of open data. It is of interest for many applications, including but not limited to data journalism, because such data is typically of high quality and reflects, in aggregated form, important aspects of a society's life such as births, immigration, and economic output. However, such open data is often not published as Linked Open Data (LOD), which limits its usability. We provide a conceptual model for the open data contained in the statistics files published by INSEE, the leading French institute for economic and societal statistics. We then describe a novel method for extracting RDF LOD that populates an instance of this conceptual model. Our method was used to produce RDF data from more than 20,000 Excel spreadsheets, and our validation indicates a 91% rate of successful extraction.
Main file
paper-HAL.pdf (297.87 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01583975 , version 1 (08-09-2017)

Cite

Tien Duc Cao, Ioana Manolescu, Xavier Tannier. Extracting linked data from statistic spreadsheets. International Workshop on Semantic Big Data, May 2017, Chicago, United States. pp. 1-5, ⟨10.1145/3066911.3066914⟩. ⟨hal-01583975⟩