Intelligent selection of language model training data, Proceedings of the ACL 2010 Conference Short Papers, pp.220-224, 2010.
Improved models for Mandarin speech-to-text transcription, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4660-4663, 2011.
DOI : 10.1109/ICASSP.2011.5947394
Enhancing the TED-LIUM corpus with selected data for language modeling and more TED talks, Proc. of LREC, pp.3935-3939, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01433246
A language modeling approach to entity recognition and disambiguation for search queries, Proceedings of the first international workshop on Entity recognition & disambiguation, ERD '14, pp.45-54, 2014.
DOI : 10.1145/2633211.2634347
Towards effective use of training data in statistical machine translation, Proceedings of the Seventh Workshop on Statistical Machine Translation, pp.317-321, 2012.
A novel neighborhood based document smoothing model for information retrieval, Information Retrieval, vol.43, issue.1, pp.391-425, 2013.
DOI : 10.1007/s10791-012-9202-3
Finding and identifying text in 900+ languages, Digital Investigation, pp.34-43, 2012.
DOI : 10.1016/j.diin.2012.05.004
The RWTH Large Vocabulary Arabic Handwriting Recognition System, 2014 11th IAPR International Workshop on Document Analysis Systems, pp.111-115, 2014.
DOI : 10.1109/DAS.2014.61
Two decades of statistical language modeling: where do we go from here?, Proceedings of the IEEE, vol.88, issue.8, 2000.
DOI : 10.1109/5.880083
Statistical methods for the recognition and understanding of speech, Encyclopedia of Language and Linguistics, 2004.
The ESTER phase II evaluation campaign for the rich transcription of French broadcast news, Interspeech, pp.1149-1152, 2005.
The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts, Interspeech, pp.2583-2586, 2009.
The ETAPE corpus for the evaluation of speech-based TV content processing in the French language, LREC - Eighth International Conference on Language Resources and Evaluation, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00712591
The EPAC corpus: Manual and automatic annotations of conversational speech in French broadcast news, LREC, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01433895
Transcription et traitement manuel de la parole spontanée pour sa reconnaissance automatique, 2011.
Selecting articles from the language model training corpus, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1695-1698, 2000.
DOI : 10.1109/ICASSP.2000.862077
The study of the effect of training set on statistical language modeling, INTERSPEECH, pp.721-724, 2001.
A unified approach to statistical language modeling for Chinese, 2000.
Toward a unified approach to statistical language modeling for Chinese, ACM Transactions on Asian Language Information Processing, vol.1, issue.1, pp.3-33, 2002.
DOI : 10.1145/595576.595578
Method of selecting training data to build a compact and efficient translation model, IJCNLP, pp.655-660, 2008.
Discriminative instance weighting for domain adaptation in statistical machine translation, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp.451-459, 2010.
Domain adaptation via pseudo in-domain data selection, Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp.355-362, 2011.
Large, pruned or continuous space language models on a GPU for statistical machine translation, Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, pp.11-19, 2012.
French Gigaword second edition, 2009.
A Machine Learning Based Approach for Vocabulary Selection for Speech Transcription, Text, Speech, and Dialogue, pp.60-67, 2013.
DOI : 10.1007/978-3-642-40585-3_9
URL : https://hal.archives-ouvertes.fr/hal-00834302
SRILM - an extensible language modeling toolkit, INTERSPEECH, 2002.
SRILM at sixteen: Update and outlook, Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop, p.5, 2011.
An empirical study of smoothing techniques for language modeling, Proceedings of the 34th annual meeting on Association for Computational Linguistics, pp.310-318, 1996.