From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers

Abstract: In this paper we present a statistical machine learning approach to neologism detection that goes beyond the use of exclusion lists. We explore the impact of three groups of features: form-related, morpho-lexical and thematic. Thematic features have not previously been used in this kind of application and offer a way to access the semantic context of new words. The results suggest that form-related features help with the overall classification task, while morpho-lexical and thematic features are better at singling out true neologisms.
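To make the starting point concrete, the sketch below shows what an exclusion-list filter followed by a few form-related features might look like. It is a minimal illustration under stated assumptions, not the authors' implementation: the tiny `KNOWN_WORDS` lexicon, the function names, and the particular features (length, hyphenation, character prefix and suffix) are all hypothetical, and the morpho-lexical and thematic features mentioned in the abstract are not covered.

```python
import re

# Hypothetical exclusion list: a tiny stand-in for a real reference lexicon.
KNOWN_WORDS = {"le", "journal", "publie", "un", "article", "sur", "les"}


def tokenize(text):
    """Lowercase the text and return runs of letters, keeping internal hyphens."""
    return re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())


def unknown_forms(text, lexicon=KNOWN_WORDS):
    """Return the forms absent from the lexicon, in order of first appearance."""
    seen = []
    for token in tokenize(text):
        if token not in lexicon and token not in seen:
            seen.append(token)
    return seen


def form_features(word):
    """Compute a few illustrative surface features (assumed, not from the paper)."""
    return {
        "length": len(word),
        "has_hyphen": "-" in word,
        "prefix3": word[:3],
        "suffix3": word[-3:],
    }


candidates = unknown_forms("Le journal publie un article sur les twitteurs")
features = [form_features(word) for word in candidates]
```

An exclusion list alone only flags unknown forms, which also include typos and proper nouns. Feature vectors of this kind would then be passed to a trained classifier to separate true neologisms from other non-words.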

https://hal.inria.fr/hal-00959079
Contributor: Ingrid Falk
Submitted on: Monday, June 2, 2014 - 5:59:15 PM
Last modification on: Wednesday, October 16, 2019 - 6:36:08 AM
Long-term archiving on: Tuesday, September 2, 2014 - 10:35:55 AM

Identifiers

  • HAL Id: hal-00959079, version 1

Citation

Ingrid Falk, Delphine Bernhard, Christophe Gérard. From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers. LREC - The 9th edition of the Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland. ⟨hal-00959079⟩
