A. Abeillé, L. Clément, and F. Toussenel, Building a treebank for French, Treebanks: Building and Using Parsed Corpora, pp.165-187, 2003.

A. Akbik, D. Blythe, and R. Vollgraf, Contextual string embeddings for sequence labeling, Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, pp.1638-1649, 2018.

R. K. Ando and T. Zhang, A framework for learning predictive structures from multiple tasks and unlabeled data, J. Mach. Learn. Res., vol.6, pp.1817-1853, 2005.

R. Bawden, M. Botalla, K. Gerdes, and S. Kahane, Correcting and validating syntactic dependency in the spoken French treebank rhapsodie, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pp.2320-2325, 2014.
URL : https://hal.archives-ouvertes.fr/halshs-01011059

M. Benesty, NER algo benchmark: spaCy, Flair, m-BERT and CamemBERT on anonymizing French commercial legal cases, 2019.

P. F. Brown, V. J. Della Pietra, P. V. deSouza, J. C. Lai, and R. L. Mercer, Class-based n-gram models of natural language, Computational Linguistics, vol.18, issue.4, pp.467-479, 1992.

M. Candito and B. Crabbé, Improving generative statistical parsing with semi-supervised word clustering, Proc. of IWPT'09, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00495267

M. Candito, G. Perrier, B. Guillaume, C. Ribeyre, and K. Fort, Deep syntax annotation of the Sequoia French treebank, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, pp.2298-2305, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00969191

M. Candito and D. Seddah, Le corpus Sequoia : annotation syntaxique et exploitation pour l'adaptation d'analyseur par pont lexical (The Sequoia corpus: syntactic annotation and use for a parser lexical domain adaptation method), Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, vol.2, pp.321-334, 2012.

B. Chan, T. Möller, M. Pietsch, T. Soni, and C. M. Yeung, German BERT, 2019.

A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek et al., Unsupervised cross-lingual representation learning at scale, 2019.

A. Conneau, R. Rinott, G. Lample, A. Williams, S. R. Bowman et al., XNLI: evaluating cross-lingual sentence representations, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp.2475-2485, 2018.

A. M. Dai and Q. V. Le, Semi-supervised sequence learning, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems, pp.3079-3087, 2015.

P. Delobelle, T. Winters, and B. Berendt, RobBERT: a Dutch RoBERTa-based language model, 2020.

J. Devlin, M. Chang, K. Lee, and K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, pp.4171-4186, 2019.

D. Nozza, F. Bianchi, and D. Hovy, What the [MASK]? Making sense of language-specific BERT models, 2020.

P. J. Ortiz Suárez, B. Sagot, and L. Romary, Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures, Challenges in the Management of Large Corpora (CMLC-7), p.9, 2019.

M. Ott, S. Edunov, A. Baevski, A. Fan, S. Gross et al., fairseq: A fast, extensible toolkit for sequence modeling, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, pp.48-53, 2019.

J. Pennington, R. Socher, and C. D. Manning, GloVe: Global vectors for word representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, pp.1532-1543, 2014.

M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark et al., Deep contextualized word representations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, vol.1, pp.2227-2237, 2018.

S. Petrov, D. Das, and R. T. McDonald, A universal part-of-speech tagset, Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, pp.2089-2096, 2012.

T. Pires, E. Schlinger, and D. Garrette, How multilingual is multilingual BERT?, 2019.

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei et al., Language models are unsupervised multitask learners, OpenAI Blog, vol.1, issue.8, p.9, 2019.

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang et al., Exploring the limits of transfer learning with a unified text-to-text transformer, 2019.

B. Sagot, M. Richard, and R. Stern, Annotation référentielle du corpus arboré de Paris 7 en entités nommées (Referential named entity annotation of the Paris 7 French treebank), Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, vol.2, pp.535-542, 2012.

M. Sanguinetti and C. Bosco, ParTUT: The Turin University Parallel Treebank, Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project, vol.589, pp.51-69, 2015.

M. Schuster and K. Nakajima, Japanese and Korean voice search, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5149-5152, 2012.

A. Seker, A. More, and R. Tsarfaty, Universal morpho-syntactic parsing and the contribution of lexica: Analyzing the ONLP lab submission to the CoNLL 2018 shared task, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.208-215, 2018.

R. Sennrich, B. Haddow, and A. Birch, Neural machine translation of rare words with subword units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, vol.1, 2016.

M. Straka, UDPipe 2.0 prototype at CoNLL 2018 UD shared task, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.197-207, 2018.

M. Straka and J. Straková, Evaluating contextualized embeddings on 54 languages in POS tagging, lemmatization and dependency parsing, 2019.

J. Straková, M. Straka, and J. Hajič, Neural architectures for nested NER through linearization, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.5326-5331, 2019.

W. L. Taylor, "Cloze procedure": A new tool for measuring readability, Journalism Bulletin, vol.30, issue.4, pp.415-433, 1953.

I. Tenney, D. Das, and E. Pavlick, BERT rediscovers the classical NLP pipeline, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.4593-4601, 2019.

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones et al., Attention is all you need, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems, pp.5998-6008, 2017.

A. Virtanen, J. Kanerva, R. Ilo, J. Luoma, J. Luotolahti, T. Salakoski, F. Ginter, and S. Pyysalo, Multilingual is not enough: BERT for Finnish, 2019.

G. Wenzek, M. Lachaux, A. Conneau, V. Chaudhary, F. Guzmán et al., CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data, 2019.

A. Williams, N. Nangia, and S. R. Bowman, A broad-coverage challenge corpus for sentence understanding through inference, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1112-1122, 2018.

T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue et al., HuggingFace's Transformers: State-of-the-art natural language processing, 2019.

S. Wu and M. Dredze, Beto, bentz, becas: The surprising cross-lingual effectiveness of BERT, 2019.

Z. Yang, Z. Dai, Y. Yang, J. G. Carbonell, R. Salakhutdinov, and Q. V. Le, XLNet: Generalized autoregressive pretraining for language understanding, 2019.