Event detection and analysis on short text messages

Abstract : In recent years, the Web has shifted from a read-only medium, where most users could only consume information, to an interactive medium that allows every user to create, share and comment on information. The downside of social media as an information source is that the texts are often short and informal and lack contextual information. On the other hand, the Web also contains structured Knowledge Bases (KBs) that could be used to enrich user-generated content. This dissertation investigates the potential of exploiting information from Linked Open Data (LOD) KBs to detect, classify and track events on social media, in particular Twitter. More specifically, we address three research questions: i) How can messages related to events be extracted and classified? ii) How can events be clustered into fine-grained categories? and iii) Given an event, to what extent can user-generated content on social media contribute to the creation of a timeline of sub-events? We provide methods that rely on LOD KBs to enrich the context of social media content; we show that supervised models can achieve good generalisation capabilities through semantic linking, thus mitigating overfitting; and we rely on graph theory to model the relationships between named entities (NEs) and the other terms in tweets in order to cluster fine-grained events. Finally, we use in-domain ontologies and local gazetteers to identify relationships between actors involved in the same event, in order to create a timeline of sub-events. We show that enriching the NEs in the text with information provided by LOD KBs improves the performance of both supervised and unsupervised machine learning models.
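
The graph-based modelling mentioned in the abstract can be illustrated with a minimal sketch. This is not the method of the thesis: it assumes that named entities and terms have already been extracted from each tweet, it links two nodes whenever they co-occur in a tweet, and it groups nodes by connected components after dropping weak edges. The function names, the `min_weight` threshold and the sample tweets are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(tweets):
    """Build an undirected co-occurrence graph (illustrative assumption).

    Each tweet is a (named_entities, other_terms) pair of string lists.
    Nodes are entities and terms; an edge links two nodes that appear in
    the same tweet, weighted by the number of tweets they share.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for entities, terms in tweets:
        nodes = sorted(set(entities) | set(terms))
        for a, b in combinations(nodes, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def cluster(graph, min_weight=2):
    """Group nodes into connected components, ignoring edges below min_weight."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            # Follow only edges that are supported by enough tweets.
            stack.extend(
                neighbour
                for neighbour, weight in graph[node].items()
                if weight >= min_weight and neighbour not in seen
            )
        clusters.append(component)
    return clusters


# Hypothetical, hand-made input: (named entities, other terms) per tweet.
sample_tweets = [
    (["Barcelona"], ["goal"]),
    (["Barcelona"], ["goal", "penalty"]),
    (["Paris"], ["strike"]),
    (["Paris"], ["strike", "transport"]),
]
sample_clusters = cluster(build_graph(sample_tweets))
```

With this sample input, entities and terms that co-occur in at least two tweets end up in the same cluster, while terms seen only once stay isolated. A real pipeline would need entity linking against a KB and a more robust clustering criterion than a fixed edge-weight threshold.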

Cited literature : 139 references

https://hal.inria.fr/tel-01680769
Contributor : Amosse Edouard
Submitted on : Thursday, January 11, 2018 - 10:03:43 AM
Last modification on : Monday, November 5, 2018 - 3:52:10 PM

File

Edouard_Thesis.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01680769, version 1

Citation

Amosse Edouard. Event detection and analysis on short text messages. Data Structures and Algorithms [cs.DS]. Université Côte d'Azur, 2017. English. ⟨tel-01680769⟩

Metrics

Record views : 226
File downloads : 270