Federating clustering and cluster labelling capabilities with a single approach based on feature maximization: French verb classes identification with IGNGF neural clustering.

Abstract: Classifications that group verbs together with a set of shared syntactic and semantic properties have proven useful in both linguistics and Natural Language Processing tasks. However, most existing approaches for automatically acquiring verb classes fail to associate the classes produced with an explicit characterisation of the syntactic and semantic properties shared by their members. We propose a novel approach to verb clustering that addresses this shortcoming and permits building verb classifications whose classes group together verbs, subcategorisation frames and thematic grids. Our approach uses a recent neural clustering method called IGNGF (Incremental Growing Neural Gas with Feature maximization). In IGNGF, the standard distance measure used to determine a winner is replaced by a feature maximisation measure that relies on the features of the data associated with clusters during learning. A main advantage of the method is that the maximised features used by IGNGF during learning can also be exploited in a final step to label the resulting clusters accurately. In this paper, we exploit IGNGF for the unsupervised classification of French verbs and evaluate the resulting clusters (i.e., verb classes) in two different ways. The first is a quantitative analysis of the clustering process, relying on a commonly used gold standard and on complementary unbiased clustering quality indexes. The second is a qualitative analysis of the cluster labelling process. Relying on an adapted gold standard, we evaluate the capacity of the IGNGF cluster labels (i.e., subcategorisation frames and thematic grids) to be exploited for bootstrapping a VerbNet-like classification for French. Both analyses clearly highlight the advantages of the approach.
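The labelling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes one formulation found in the feature maximisation literature: a feature's recall and precision are computed per cluster from summed feature weights, combined into an F-measure, and each feature is used as a label for the cluster where that F-measure is highest. The function names, the toy verbs and the feature names (`NP_V_NP`, `Agent-Theme`, and so on) are hypothetical, and the sketch covers only the labelling step, not the incremental neural gas learning itself.

```python
from collections import defaultdict


def feature_f_measures(data, assignment):
    """Return {cluster: {feature: F-measure}}.

    data:       list of dicts mapping feature name -> non-negative weight
    assignment: list of cluster ids, one per item in data
    Assumed formulation (not stated in the abstract):
      recall    = weight of f in cluster / weight of f over all clusters
      precision = weight of f in cluster / weight of all features in cluster
    """
    cluster_feature = defaultdict(lambda: defaultdict(float))
    feature_total = defaultdict(float)
    cluster_total = defaultdict(float)
    for vector, cluster in zip(data, assignment):
        for feature, weight in vector.items():
            cluster_feature[cluster][feature] += weight
            feature_total[feature] += weight
            cluster_total[cluster] += weight

    scores = {}
    for cluster, features in cluster_feature.items():
        scores[cluster] = {}
        for feature, weight in features.items():
            if weight <= 0:
                continue  # a feature with no weight cannot describe the cluster
            recall = weight / feature_total[feature]
            precision = weight / cluster_total[cluster]
            scores[cluster][feature] = 2 * recall * precision / (recall + precision)
    return scores


def label_clusters(scores):
    """Assign each feature to the cluster where its F-measure is highest."""
    best = {}  # feature -> (cluster, score)
    for cluster, features in scores.items():
        for feature, score in features.items():
            if feature not in best or score > best[feature][1]:
                best[feature] = (cluster, score)
    labels = defaultdict(list)
    for feature, (cluster, _) in best.items():
        labels[cluster].append(feature)
    return {cluster: sorted(features) for cluster, features in labels.items()}


# Hypothetical toy data: four verbs described by frame and grid features.
verbs = [
    {"NP_V_NP": 1, "Agent-Theme": 1},
    {"NP_V_NP": 1, "Agent-Theme": 1},
    {"NP_V": 1, "Agent": 1},
    {"NP_V": 1, "NP_V_NP": 1, "Agent": 1},
]
assignment = [0, 0, 1, 1]

scores = feature_f_measures(verbs, assignment)
labels = label_clusters(scores)
print(labels)
```

In this toy run the feature `NP_V_NP` occurs in both clusters but is kept as a label only for cluster 0, where its F-measure is higher. That is the sense in which the same measure can both characterise clusters and discard features that do not discriminate between them.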
Document type:
Journal article
Neurocomputing, Elsevier, 2015, 147, pp.136-146. 〈10.1016/j.neucom.2014.02.060〉
https://hal.inria.fr/hal-01074277
Contributor: Ingrid Falk
Submitted on: Tuesday, 14 October 2014 - 14:34:09
Last modified on: Tuesday, 24 April 2018 - 13:32:34

Citation

Jean-Charles Lamirel, Ingrid Falk, Claire Gardent. Federating clustering and cluster labelling capabilities with a single approach based on feature maximization: French verb classes identification with IGNGF neural clustering. Neurocomputing, Elsevier, 2015, 147, pp.136-146. 〈10.1016/j.neucom.2014.02.060〉. 〈hal-01074277〉
