Boosting for Model Selection in Syntactic Parsing

Rachel Bawden

Student theses -- Hal-Inria+ Year: 2015

Une approche par boosting à la sélection de modèles pour l’analyse syntaxique statistique

Rachel Bawden

Role: Author
PersonId : 9441
IdHAL : rachel-bawden
ORCID : 0000-0001-9553-1768
IdRef : 233174591

Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing

Abstract

In this work we present our approach to model selection for statistical parsing via boosting. The method targets the inefficiency of current feature selection methods: it allows a constant feature selection time at each iteration, rather than the increasing selection time of current standard forward wrapper methods. With the aim of performing feature selection on very high-dimensional data, in particular for parsing morphologically rich languages, we test the approach, which uses the multiclass AdaBoost algorithm SAMME (Zhu et al., 2006), on French data from the French Treebank, using a multilingual discriminative constituency parser (Crabbé, 2014). Current results show that the method is indeed far more efficient than a naïve method, and the performance of the models produced is promising, with F-scores comparable to carefully selected manual models. We provide some perspectives for improving on this performance in future work.
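The idea of selecting features through boosting can be illustrated with a minimal sketch of generic SAMME, written under several assumptions: the weak learners here are simple lookup-table classifiers that each see a single feature, and the names `fit_stump`, `samme_select` and `predict` are hypothetical. This is not the parser or the feature templates used in the thesis, only the standard SAMME weight update applied to a toy feature pool.

```python
import math
from collections import defaultdict


def fit_stump(X, y, w, feature):
    """Weak learner that sees one feature only: each observed value of that
    feature is mapped to the class with the largest total example weight."""
    votes = defaultdict(lambda: defaultdict(float))
    overall = defaultdict(float)
    for xi, yi, wi in zip(X, y, w):
        votes[xi[feature]][yi] += wi
        overall[yi] += wi
    table = {value: max(cls, key=cls.get) for value, cls in votes.items()}
    default = max(overall, key=overall.get)  # fallback for unseen values
    return lambda x: table.get(x[feature], default)


def samme_select(X, y, n_rounds):
    """Run SAMME and return a list of (alpha, feature_index, classifier)."""
    K = len(set(y))                    # number of classes
    w = [1.0 / len(X)] * len(X)        # uniform initial example weights
    ensemble = []
    for _ in range(n_rounds):
        # Every round scans the same candidate pool exactly once.
        best = None
        for f in range(len(X[0])):
            h = fit_stump(X, y, w, f)
            err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
            if best is None or err < best[0]:
                best = (err, f, h)
        err, f, h = best
        if err >= 1.0 - 1.0 / K:       # no better than random guessing
            break
        perfect = err < 1e-10
        err = max(err, 1e-10)          # avoid division by zero
        # SAMME classifier weight: the log(K - 1) term is the multiclass correction.
        alpha = math.log((1.0 - err) / err) + math.log(K - 1)
        ensemble.append((alpha, f, h))
        if perfect:
            break
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * math.exp(alpha) if h(xi) != yi else wi
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the selected weak learners."""
    scores = defaultdict(float)
    for alpha, _, h in ensemble:
        scores[h(x)] += alpha
    return max(scores, key=scores.get)


# Toy data (invented for illustration): feature 0 determines the class,
# feature 1 is noise.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
y = ["a", "a", "b", "b", "c", "c"]
ensemble = samme_select(X, y, n_rounds=3)
print([f for _, f, _ in ensemble])  # indices of the selected features
```

In this sketch the cost of a round depends only on the size of the candidate pool and the number of examples, not on how many features were selected in earlier rounds. A forward wrapper, by contrast, retrains a model that grows with every feature it adds.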

Keywords

model selection, feature selection, boosting, parsing, syntactic parsing

Domains

Machine Learning [cs.LG], Machine Learning [stat.ML], Linguistics

Main file

rachel_bawden_memoire_m2.pdf (1.1 MB)

https://inria.hal.science/hal-01258945

Submitted on: Tuesday, 19 January 2016, 16:40:44

Last modified on: Tuesday, 3 October 2023, 17:18:04

Long-term archiving on: Friday, 11 November 2016, 13:13:04

Dates and versions

hal-01258945 , version 1 (19-01-2016)

Identifiers

HAL Id : hal-01258945 , version 1

Cite

Rachel Bawden. Boosting for Model Selection in Syntactic Parsing. Machine Learning [cs.LG]. 2015. ⟨hal-01258945⟩

Collections

UNIV-PARIS7 INRIA INRIA2

91 views

90 downloads
