From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2014

From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers

Ingrid Falk
Delphine Bernhard
Christophe Gérard
  • Role: Author
  • PersonId: 958533

Abstract

In this paper we present a statistical machine learning approach to neologism detection that goes some way beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic. Thematic features have not yet been used in this kind of application and offer a way to access the semantic context of new words. The results suggest that form-related features help with the overall classification task, while morpho-lexical and thematic features better single out true neologisms.
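As a rough illustration of the starting point the abstract mentions, the sketch below filters tokens against an exclusion list and then computes a few surface features for the remaining unknown forms. It is a minimal sketch under stated assumptions, not the authors' implementation: the tiny `KNOWN_WORDS` lexicon, the function names `unknown_tokens` and `form_features`, and the three features shown are all hypothetical, and the paper's actual feature set and classifier are not reproduced here.

```python
import re

# Hypothetical exclusion list; a real system would rely on a large lexicon.
KNOWN_WORDS = {"le", "journal", "publie", "un", "article", "sur", "les"}


def unknown_tokens(text, lexicon=KNOWN_WORDS):
    """Return lowercased word tokens that are absent from the exclusion list."""
    # Sequences of letters, optionally joined by hyphens (digits excluded).
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    return [t for t in tokens if t not in lexicon]


def form_features(token):
    """Illustrative surface features; an assumption, not the paper's feature set."""
    return {
        "length": len(token),
        "has_hyphen": "-" in token,
        "suffix3": token[-3:],
    }


candidates = unknown_tokens("Le journal publie un article sur les twitteurs")
features = [form_features(t) for t in candidates]
```

In a setup like the one the abstract describes, feature dictionaries of this kind would be combined with morpho-lexical and thematic features and passed to a trained classifier, which decides whether each unknown form is a true neologism or a non-word such as a typo.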
Main file
logo.pdf (93.95 KB)
main.pdf (363.06 KB)
Origin: Files produced by the author(s)
Format: Other

Dates and versions

hal-00959079 , version 1 (02-06-2014)

Identifiers

  • HAL Id : hal-00959079 , version 1

Cite

Ingrid Falk, Delphine Bernhard, Christophe Gérard. From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers. LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland. ⟨hal-00959079⟩

Collections

SITE-ALSACE
535 views
1506 downloads
