Domain adaptation for sequence labeling using hidden Markov models

Edouard Grave (1, 2), Guillaume Obozinski (3), Francis Bach (1, 2)
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: Most natural language processing systems based on machine learning are not robust to domain shift. For example, a state-of-the-art syntactic dependency parser trained on Wall Street Journal sentences suffers an absolute drop in performance of more than ten points when tested on text from the Web. An efficient way to make these methods more robust to domain shift is to first learn a word representation from large amounts of unlabeled data from both domains, and then to use this representation as features in a supervised learning algorithm. In this paper, we propose to use hidden Markov models (HMMs) to learn word representations for part-of-speech tagging. In particular, we study the influence of learning the representation from source-domain data, target-domain data, or both, and we compare the different ways to represent words using an HMM.
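As an illustration of one way an HMM can yield word representations, the sketch below computes, for each token of a sentence, the posterior distribution over latent states with the forward-backward algorithm; each row could then serve as a feature vector for a supervised tagger. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify which representations are compared, the function name `hmm_posteriors` is hypothetical, and the HMM parameters are random placeholders where a real system would estimate them from unlabeled source and target text.

```python
import numpy as np


def hmm_posteriors(obs, pi, A, B):
    """Return gamma[t, k] = P(state_t = k | obs) via scaled forward-backward.

    obs: sequence of word indices, length T
    pi:  initial state distribution, shape (K,)
    A:   transition matrix, A[i, j] = P(state j | state i), shape (K, K)
    B:   emission matrix, B[k, w] = P(word w | state k), shape (K, V)
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))

    # Forward pass; each row is rescaled to sum to 1 to avoid underflow.
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()

    # Backward pass, rescaled the same way.
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # The per-row scaling constants cancel once each row is renormalized.
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma


# Toy usage with random placeholder parameters (assumption: K = 3 latent
# states, vocabulary of V = 5 word types).
rng = np.random.default_rng(0)
K, V = 3, 5
pi = rng.dirichlet(np.ones(K))
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(V), size=K)
sentence = [0, 3, 1, 4]  # word indices of one sentence
features = hmm_posteriors(sentence, pi, A, B)  # shape (4, 3), one row per token
```

Because the posterior depends on the whole sentence, the same word type can receive different feature vectors in different contexts; a context-independent alternative would be to use a fixed vector per word type.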
Document type:
Conference paper
New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop), Dec 2013, Lake Tahoe, United States. 2013

https://hal.inria.fr/hal-00918371
Contributor: Edouard Grave
Submitted on: Friday, December 13, 2013 - 14:11:27
Last modified on: Friday, May 25, 2018 - 12:02:06
Document(s) archived on: Tuesday, March 18, 2014 - 12:35:35

Files

da.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00918371, version 1
  • arXiv: 1312.4092

Citation

Edouard Grave, Guillaume Obozinski, Francis Bach. Domain adaptation for sequence labeling using hidden Markov models. New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop), Dec 2013, Lake Tahoe, United States. 2013. 〈hal-00918371〉

