Continuous variation in computational morphology - the example of Swiss German

Yves Scherrer 1
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
Abstract: Most work in natural language processing is geared towards written, standardized language varieties. This focus is generally justified on practical grounds of data availability and socioeconomic relevance, but it does not always reflect the linguistic reality of non-standard varieties. In this paper, we aim to describe computationally the morphology of a language with continuous internal variation, of the kind found in most dialect landscapes. The work presented here is applied to Swiss German dialects; these dialects are well documented through dialectological research and are among the most vibrant in Europe in terms of social acceptance and media exposure. Our work is inspired by previous research in generative dialectology and computational linguistics, which attempts to derive multiple dialect systems from a single reference system with the help of hand-written transformation rules. Such transformation rules may be called georeferenced, in the sense that they link to a set of geographic coordinates that can be grounded on a map. We improve on this work in two respects. First, our model associates all rules with probabilistic maps extracted from linguistic atlases. This allows us to handle transition zones in which several variants are accepted. Second, we provide a full implementation of this model on the basis of finite-state transducers. In addition to finite-state composition, which derives dialectal word forms by applying several rules in cascade, we propose a second type of composition, map composition, which computes the area of validity of the derived word forms from the probabilistic maps associated with the rules. In this paper, we focus on two aspects of the proposed model: its theoretical value as a computationally effective description of continuous linguistic variation, and its practical value as a word-level machine translation system from Standard German into the various Swiss German dialects. We evaluate the model on the latter aspect.
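The interplay of rule cascades and map composition described in the abstract can be illustrated with a small sketch. This is not the author's implementation: plain string replacement stands in for finite-state composition, map composition is assumed here to be a pointwise product of probabilities (the abstract does not specify the operation), and the word, rules, locations, and probabilities are all invented placeholders.

```python
from itertools import product

# Hypothetical survey points; a real system would use atlas coordinates.
LOCATIONS = ["P1", "P2"]

# Each rule is a list of competing variants: (source, target, probability map).
# All values below are invented for illustration only.
RULES = [
    [("a", "x", {"P1": 0.8, "P2": 0.3}),
     ("a", "y", {"P1": 0.2, "P2": 0.7})],
    [("c", "z", {"P1": 1.0, "P2": 0.5}),
     ("c", "c", {"P1": 0.0, "P2": 0.5})],
]


def compose_maps(map_a, map_b):
    """Assumed map composition: pointwise product of two probability maps."""
    return {loc: map_a[loc] * map_b[loc] for loc in map_a}


def derive(word, rules, locations):
    """Apply one variant of every rule in cascade, for all variant choices.

    Returns a dict from derived form to its composed probability map.
    """
    results = {}
    for choice in product(*rules):
        form = word
        area = dict.fromkeys(locations, 1.0)
        for source, target, prob_map in choice:
            # Stand-in for finite-state composition of one rule.
            form = form.replace(source, target)
            area = compose_maps(area, prob_map)
        # Different rule paths may yield the same form; add their maps.
        if form in results:
            area = {loc: results[form][loc] + area[loc] for loc in locations}
        results[form] = area
    return results


if __name__ == "__main__":
    for form, area in derive("abc", RULES, LOCATIONS).items():
        print(form, area)
```

Under these assumptions, a location where two variants are both plausible (here `P2`) ends up with several derived forms sharing the probability mass, which is one way to picture a transition zone.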
Document type:
Conference paper
TheoreticAl and Computational MOrphology: New Trends and Synergies (TACMO), Jul 2013, Genève, Switzerland. 2013

https://hal.inria.fr/hal-00851251
Contributor: Yves Scherrer
Submitted on: Tuesday, 13 August 2013 - 10:25:54
Last modified on: Thursday, 15 November 2018 - 20:27:26
Document(s) archived on: Wednesday, 5 April 2017 - 20:47:49

File

tacmo-abstract-ys.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00851251, version 1

Citation

Yves Scherrer. Continuous variation in computational morphology - the example of Swiss German. TheoreticAl and Computational MOrphology: New Trends and Synergies (TACMO), Jul 2013, Genève, Switzerland. 2013. 〈hal-00851251〉
