Structured Penalties for Log-linear Language Models

Anil Nelakanti 1, Cédric Archambeau 1, Julien Mairal 2, Francis Bach 3, 4, Guillaume Bouchard 1, *
* Corresponding author
2 LEAR - Learning and recognition in vision, Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
4 SIERRA - Statistical Machine Learning and Parsimony, DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: Language models can be formalized as log-linear regression models where the input features represent previously observed contexts up to a certain length m. The complexity of existing algorithms to learn the parameters by maximum likelihood scales linearly in nd, where n is the length of the training corpus and d is the number of observed features. We present a model whose learning complexity grows logarithmically in d, making it possible to efficiently leverage longer contexts. We account for the sequential structure of natural language using tree-structured penalized objectives to avoid overfitting and achieve better generalization.
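To make the setup in the abstract concrete, below is a minimal Python sketch of a log-linear language model whose features are context suffixes of length at most m, together with an illustrative nested-group penalty over the suffix trie. All names, the toy penalty, and the demo data are assumptions made for illustration; the authors' actual training algorithms and the exact tree-structured norms of the paper are not reproduced here.

```python
import math

# Minimal sketch of a log-linear language model with context-suffix
# features, in the spirit of the abstract above. Variable names, the toy
# penalty, and the demo data are illustrative assumptions, not the
# authors' implementation.

m = 3      # maximum context length (chosen arbitrarily for the demo)
w = {}     # weight per feature, keyed by (context_suffix, next_word)

def suffix_features(context, word):
    """Features active when predicting `word` after `context`: one per
    context suffix of length 0..m, i.e. the nodes on a single
    root-to-leaf path of the suffix trie."""
    return [(tuple(context[len(context) - k:]), word)
            for k in range(min(m, len(context)) + 1)]

def log_prob(context, word, vocab):
    """log p(word | context) under the log-linear model."""
    def score(v):
        return sum(w.get(f, 0.0) for f in suffix_features(context, v))
    log_z = math.log(sum(math.exp(score(v)) for v in vocab))
    return score(word) - log_z

def tree_group_penalty(lam=0.1):
    """Illustrative tree-structured penalty: one l2 group per suffix-trie
    node, containing that node's weights together with those of all longer
    suffixes extending it (its descendants). Nested groups of this kind
    push long contexts to zero unless their shorter ancestors are used."""
    groups = {}
    for (suffix, _), weight in w.items():
        for k in range(len(suffix) + 1):           # every ancestor suffix
            groups.setdefault(suffix[len(suffix) - k:], []).append(weight)
    return lam * sum(math.sqrt(sum(x * x for x in g))
                     for g in groups.values())

def penalized_nll(corpus, vocab, lam=0.1):
    """Objective a learner would minimize over w: negative log-likelihood
    of the corpus plus the structured penalty."""
    nll = -sum(log_prob(corpus[max(0, i - m):i], word, vocab)
               for i, word in enumerate(corpus))
    return nll + tree_group_penalty(lam)

if __name__ == "__main__":
    corpus = "the cat sat on the mat".split()     # toy data
    print(penalized_nll(corpus, vocab=set(corpus)))
```

In this sketch each corpus position activates only the m + 1 groups lying on one root-to-leaf path of the suffix trie. Exploiting that nesting is, roughly, how one can hope for per-update costs that track the context depth rather than the full feature count d, which is the intuition behind the abstract's logarithmic-in-d claim.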
Document type:
Conference papers

https://hal.inria.fr/hal-00904820
Contributor: Julien Mairal
Submitted on: Friday, November 15, 2013 - 12:00:56 PM
Last modification on: Tuesday, February 12, 2019 - 10:30:05 AM
Long-term archiving on: Sunday, February 16, 2014 - 4:31:00 AM

File

anil_emnlp.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00904820, version 1

Citation

Anil Nelakanti, Cédric Archambeau, Julien Mairal, Francis Bach, Guillaume Bouchard. Structured Penalties for Log-linear Language Models. EMNLP - Empirical Methods in Natural Language Processing, Oct 2013, Seattle, United States. pp.233-243. ⟨hal-00904820⟩
