Optimization with Sparsity-Inducing Penalties, Machine Learning, pp.1-106, 2012. ,
DOI : 10.1561/2200000015
URL : https://hal.archives-ouvertes.fr/hal-00613125
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009. ,
DOI : 10.1137/080716542
Neural Probabilistic Language Models, Journal of Machine Learning Research, vol.3, pp.1137-1155, 2003. ,
DOI : 10.1007/3-540-33486-6_6
URL : https://hal.archives-ouvertes.fr/hal-01434258
An O(n) algorithm for quadratic knapsack problems, Operations Research Letters, vol.3, issue.3, pp.163-166, 1984. ,
DOI : 10.1016/0167-6377(84)90010-5
Analysis of Feature Extraction and Channel Compensation in a GMM Speaker Recognition System, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.7, pp.151979-1986, 2007. ,
DOI : 10.1109/TASL.2007.902499
Exact decoding of phrase-based translation models through lagrangian relaxation, Proc. Conf. Empirical Methods for Natural Language Processing, pp.26-37, 2011. ,
A survey of smoothing techniques for ME models, IEEE Transactions on Speech and Audio Processing, vol.8, issue.1, pp.37-50, 2000. ,
DOI : 10.1109/89.817452
An Introduction to Algorithms, 1990. ,
Inducing features of random fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19, issue.4, pp.380-393, 1997. ,
Efficient projections onto the ? 1 -ball for learning in high dimensions, Proc. 25th Int. Conf. Machine Learning, 2008. ,
From ukkonen to Mc- Creight and weiner: A unifying view of linear-time suffix tree construction, Algorithmica, 1997. ,
A bit of progress in language modelling, Computer Speech and Language, pp.403-434, 2001. ,
Exponential priors for maximum entropy models, Proc. North American Chapter of the Association of Computational Linguistics, 2004. ,
Accelerated gradient methods for stochastic optimization and online learning, Advances in Neural Information Processing Systems, 2009. ,
Proximal methods for hierarchical sparse coding, Journal of Machine Learning Research, vol.12, pp.2297-2334, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00516723
Improved backing-off for M-gram language modeling, 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995. ,
DOI : 10.1109/ICASSP.1995.479394
Structured sparsity in structured prediction, Proc. Conf. Empirical Methods for Natural Language Processing, pp.1500-1511, 2011. ,
Generalized linear models. Chapman and Hall, 1989. ,
Three new graphical models for statistical language modelling, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007. ,
DOI : 10.1145/1273496.1273577
A scalable hierarchical distributed language model, Advances in Neural Information Processing Systems, 2008. ,
Gradient methods for minimizing composite objective function. CORE Discussion Pa- per, 2007. ,
Discriminative language modeling with conditional random fields and the perceptron algorithm, Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics , ACL '04, 2004. ,
DOI : 10.3115/1218955.1218962
Srilm-an extensible language modeling toolkit, Proc. Int. Conf. Spoken Language Processing, pp.901-904, 2002. ,
Online construction of suffix trees, Algorithmica, 1995. ,
Explicit relevance models in intent-oriented information retrieval diversification, Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, SIGIR '12, pp.75-84, 2012. ,
DOI : 10.1145/2348283.2348297
A stochastic memoizer for sequence data, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009. ,
DOI : 10.1145/1553374.1553518
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.7670
The sequence memoizer, Communications of the ACM, vol.54, issue.2, pp.91-98, 2011. ,
DOI : 10.1145/1897816.1897842
Efficient training methods for maximum entropy language modeling, Proc. 6th Inter. Conf. Spoken Language Technologies, pp.114-117, 2000. ,
The composite absolute penalties family for grouped and hierarchical variable selection. The Annals of Statistics, pp.3468-3497, 2009. ,