
Searching for Truth in a Database of Statistics

Abstract : The proliferation of falsehood and misinformation, in particular through the Web, has led to increasing effort being invested in journalistic fact-checking. Fact-checking journalists typically check the accuracy of a claim against some trusted data source. Statistical databases, such as those compiled by state agencies, are often used as trusted data sources because they contain valuable, high-quality information. However, their usability is limited when they are shared in a format such as HTML or spreadsheets: this makes it hard to find the most relevant dataset for checking a specific claim, or to quickly extract from a dataset the best answer to a given query. We present a novel algorithm enabling the exploitation of such statistical tables, by (i) identifying the statistical datasets most relevant to a given fact-checking query, and (ii) extracting from each dataset the best specific (precise) query answer it may contain. We have implemented our approach and experimented on the complete corpus of statistics obtained from INSEE, the French national statistical institute. Our experiments and comparisons demonstrate the effectiveness of our proposed method.
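The two steps named in the abstract can be pictured with a small sketch. This is not the paper's algorithm, which the abstract does not detail: it swaps in plain keyword overlap as the relevance score, and every name in it (`Dataset`, `rank_datasets`, `best_cell`) as well as the toy table values is a hypothetical chosen for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass
class Dataset:
    """A toy statistical table: a title, header labels, and a grid of cells."""
    title: str
    row_headers: list
    col_headers: list
    cells: list  # cells[i][j] belongs to row_headers[i] and col_headers[j]


def tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query_tokens, text):
    """Number of query tokens that also occur in text (assumed scoring rule)."""
    return len(query_tokens & tokens(text))


def rank_datasets(query, datasets):
    """Step (i): order datasets by keyword overlap with their title and headers."""
    q = tokens(query)
    scored = []
    for d in datasets:
        metadata = " ".join([d.title, *d.row_headers, *d.col_headers])
        score = overlap(q, metadata)
        if score > 0:  # drop datasets that share no keyword with the query
            scored.append((score, d))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


def best_cell(query, dataset):
    """Step (ii): pick the cell whose row and column headers best match the query."""
    q = tokens(query)
    i = max(range(len(dataset.row_headers)),
            key=lambda r: overlap(q, dataset.row_headers[r]))
    j = max(range(len(dataset.col_headers)),
            key=lambda c: overlap(q, dataset.col_headers[c]))
    return dataset.row_headers[i], dataset.col_headers[j], dataset.cells[i][j]


if __name__ == "__main__":
    # Placeholder values, not real statistics.
    toy = [
        Dataset("Unemployment rate by region", ["Paris", "Lyon"],
                ["2015", "2016"], [[1.0, 2.0], [3.0, 4.0]]),
        Dataset("Population by age group", ["Under 20", "Over 60"],
                ["2015", "2016"], [[5.0, 6.0], [7.0, 8.0]]),
    ]
    query = "unemployment rate Lyon 2016"
    for d in rank_datasets(query, toy):
        print(d.title, best_cell(query, d))
```

A real system over a corpus such as the INSEE tables would presumably need more than exact token matching (for instance, handling of synonyms, hierarchical headers, and French text), so the sketch only shows how the two steps fit together.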
Document type :
Conference papers

Cited literature : 11 references
Contributor : Tien-Duc Cao
Submitted on : Wednesday, March 28, 2018 - 3:16:14 PM
Last modification on : Thursday, January 20, 2022 - 5:30:02 PM
Long-term archiving on : Thursday, September 13, 2018 - 9:55:39 AM




  • HAL Id : hal-01745768, version 1


Tien-Duc Cao, Ioana Manolescu, Xavier Tannier. Searching for Truth in a Database of Statistics. WebDB 2018 - 21st International Workshop on the Web and Databases, Jun 2018, Houston, United States. pp.1-6. ⟨hal-01745768⟩


