Navigating the Maze of Wikidata Query Logs

Angela Bonifati (1, 2), Wim Martens (3), Thomas Timm (3)
1 BD - Base de Données
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
2 TYREX - Types and Reasoning for the Web
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract : This paper provides an in-depth and diversified analysis of the Wikidata query logs, recently made publicly available. Although the usage of Wikidata queries has been the object of recent studies, our analysis of the query traffic reveals interesting and unforeseen findings concerning the usage, types of recursion, and the shape classification of complex recursive queries. Wikidata specific features combined with recursion let us identify a significant subset of the entire corpus that can be used by the community for further assessment. We considered and analyzed the queries across many different dimensions, such as the robotic and organic queries, the presence/absence of constants along with the correctly executed and timed out queries. A further investigation that we pursue in this paper is to find, given a query, a number of queries structurally similar to the given query. We provide a thorough characterization of the queries in terms of their expressive power, their topological structure and shape, along with a deeper understanding of the usage of recursion in these logs. We make the code for the analysis available as open source.
Document type : Conference papers

https://hal.inria.fr/hal-02096714
Contributor : Tyrex Equipe
Submitted on : Thursday, April 11, 2019 - 3:12:18 PM
Last modification on : Tuesday, May 28, 2019 - 2:16:45 PM

Citation

Angela Bonifati, Wim Martens, Thomas Timm. Navigating the Maze of Wikidata Query Logs. WWW 2019 - The World Wide Web Conference, May 2019, San Francisco, United States. pp.127-138, ⟨10.1145/3308558.3313472⟩. ⟨hal-02096714⟩
