Conference Papers Year : 2014

From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers

Ingrid Falk, Delphine Bernhard, Christophe Gérard

Abstract

In this paper we present a statistical machine learning approach to neologism detection that goes some way beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic. Thematic features have not yet been used in this kind of application and offer a way to access the semantic context of new words. The results suggest that form-related features help with the overall classification task, while morpho-lexical and thematic features are better at singling out true neologisms.
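As a rough illustration of the pipeline the abstract outlines, the sketch below first uses an exclusion list to collect unknown tokens ("non words") and then computes a few form-related features for each candidate. Everything in it is an assumption made for illustration: the tiny `KNOWN_WORDS` lexicon, the example sentence with the invented word "twitto-clash", the helper names `unknown_tokens` and `form_features`, and the particular features chosen. It is not the feature set or code used in the paper.

```python
import re

# Hypothetical exclusion list; a real system would load a large lexicon.
KNOWN_WORDS = {"le", "ministre", "lance", "un", "sur", "son", "blog"}

# Runs of letters, optionally joined by hyphens (digits and underscores excluded).
TOKEN_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def unknown_tokens(text, known=KNOWN_WORDS):
    """Return tokens whose lowercase form is absent from the exclusion list."""
    return [tok for tok in TOKEN_RE.findall(text) if tok.lower() not in known]


def form_features(token):
    """Compute simple surface features (illustrative only) for one candidate."""
    lowered = token.lower()
    return {
        "length": len(token),
        "has_hyphen": "-" in token,
        "is_capitalised": token[0].isupper(),
        "char_trigrams": sorted({lowered[i:i + 3] for i in range(len(lowered) - 2)}),
    }


# Invented example sentence; "twitto-clash" stands in for an unknown word.
candidates = unknown_tokens("Le ministre lance un twitto-clash sur son blog")
features = {tok: form_features(tok) for tok in candidates}
```

In a setup like this, the feature dictionaries would be passed to a statistical classifier trained on candidates labelled as neologisms or non-neologisms; morpho-lexical and thematic features would need further resources (for example an affix lexicon or topic information about the article), which this sketch leaves out.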
Origin : Files produced by the author(s)
Format : Other

Dates and versions

hal-00959079, version 1 (02-06-2014)

Identifiers

  • HAL Id : hal-00959079 , version 1

Cite

Ingrid Falk, Delphine Bernhard, Christophe Gérard. From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers. LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland. ⟨hal-00959079⟩

Collections

SITE-ALSACE