Geo-linguistic fingerprint and the evolution of languages in Twitter

Abstract : Access to the content of messages sent by a given group of subscribers of a social network can be used to identify and quantify features of that group. A feature can stand for the level of interest in some event or product, or for the popularity of an idea, a musical hit, or a political figure. It can also stand for the way the written language is used and transformed: the way words are spelled and grammar is used. In this paper we are interested in identifying features of groups of subscribers who have their geographic location and their language in common. We develop a methodology that allows one to perform such a study using a freely available statistical tool, which makes use of the portion of all tweets that Twitter makes available for free over the Internet. The methodology is based on the fact that one can differentiate among some geographic areas according to the activity pattern of tweets over the course of the day. We present an application of this methodology to the study of new spellings and new words created in Twitter messages.
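
The idea of telling geographic areas apart by their daily tweet activity pattern can be illustrated with a minimal Python sketch. This is not the authors' tool or procedure; the function names `hourly_profile` and `best_shift` are hypothetical, and the squared-difference comparison is an assumption chosen only for illustration. The sketch builds a 24-hour activity histogram from timezone-aware timestamps and finds the circular shift, in hours, that best aligns it with a reference profile.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Return a 24-element list: share of tweets posted in each UTC hour.

    Assumes every timestamp is a timezone-aware datetime.
    """
    counts = Counter(ts.astimezone(timezone.utc).hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one timestamp is required")
    return [counts.get(h, 0) / total for h in range(24)]


def best_shift(profile, reference):
    """Return the circular shift (0-23 hours) that best aligns profile with reference.

    Illustrative assumption: alignment quality is the sum of squared differences.
    """
    def distance(shift):
        return sum(
            (profile[(h + shift) % 24] - reference[h]) ** 2 for h in range(24)
        )
    return min(range(24), key=distance)
```

A shift of several hours between two groups that write in the same language hints that they sit in different time zones, which is the kind of geographic separation the abstract describes.
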
Document type :
Reports

Cited literature [6 references]

https://hal.inria.fr/hal-00674853
Contributor : Eitan Altman
Submitted on : Tuesday, April 24, 2012 - 8:17:50 PM
Last modification on : Friday, March 22, 2019 - 11:34:06 AM
Document(s) archived on : Wednesday, July 25, 2012 - 2:21:26 AM

File

c6.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00674853, version 2

Citation

Eitan Altman, Yonathan Portilla. Geo-linguistic fingerprint and the evolution of languages in Twitter. [Research Report] 2012, pp.14. ⟨hal-00674853v2⟩

Metrics

Record views : 453
File downloads : 535