A. Akbik, D. Blythe, and R. Vollgraf, Contextual string embeddings for sequence labeling, Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, pp.1638-1649, 2018.

R. Al-Rfou, B. Perozzi, and S. Skiena, Polyglot: Distributed word representations for multilingual NLP, Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pp.183-192, 2013.

B. Bohnet, R. McDonald, G. Simões, D. Andor, E. Pitler et al., Morphosyntactic tagging with a meta-BiLSTM model over context sensitive token encodings, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pp.2642-2652, 2018.

P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, Enriching word vectors with subword information, Transactions of the Association for Computational Linguistics, vol.5, pp.135-146, 2017.

M. Buch-Kromann, The Danish dependency treebank and the DTAG treebank tool, 2nd Workshop on Treebanks and Linguistic Theories (TLT), pp.217-220, 2003.

B. Chan, T. Möller, M. Pietsch, T. Soni, and C. M. Yeung, German BERT, 2019.

W. Che, Y. Liu, Y. Wang, B. Zheng, and T. Liu, Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.55-64, 2018.

S. Desrochers, C. Paradis, and V. M. Weaver, A validation of DRAM RAPL power measurements, Proceedings of the Second International Symposium on Memory Systems, MEMSYS '16, pp.455-470, 2016.

J. Devlin, M. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv e-prints, 2018.

J. Devlin, M. Chang, K. Lee, and K. Toutanova, Multilingual BERT, 2018.

T. Dozat and C. D. Manning, Deep biaffine attention for neural dependency parsing, 5th International Conference on Learning Representations, 2017.

T. Dozat, P. Qi, and C. D. Manning, Stanford's graph-based neural dependency parser at the CoNLL 2017 shared task, Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.20-30, 2017.

E. Grave, P. Bojanowski, P. Gupta, A. Joulin, and T. Mikolov, Learning word vectors for 157 languages, Proceedings of the 11th Language Resources and Evaluation Conference, 2018.

S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997.

A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou et al., FastText.zip: Compressing text classification models, 2016.

A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov, Bag of tricks for efficient text classification, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.2, pp.427-431, 2017.

D. Kondratyuk and M. Straka, 75 Languages, 1 Model: Parsing Universal Dependencies Universally. arXiv e-prints, 2019.

Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma et al., ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019.

Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi et al., RoBERTa: A robustly optimized BERT pretraining approach, 2019.

L. Martin, B. Muller, P. J. Ortiz Suárez, Y. Dupont, L. Romary, É. Villemonte de la Clergerie, D. Seddah, and B. Sagot, CamemBERT: a Tasty French Language Model. arXiv e-prints, 2019.

R. Mihalcea, Using Wikipedia for automatic word sense disambiguation, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pp.196-203, 2007.

T. Mikolov, E. Grave, P. Bojanowski, C. Puhrsch, and A. Joulin, Advances in pre-training distributed word representations, Proceedings of the Eleventh International Conference on Language Resources and Evaluation, LREC 2018, 2018.

P. J. Ortiz Suárez, B. Sagot, and L. Romary, Asynchronous pipeline for processing huge corpora on medium to low resource infrastructures, Challenges in the Management of Large Corpora (CMLC-7) 2019, p.9, 2019.

J. Pennington, R. Socher, and C. Manning, GloVe: Global vectors for word representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.1532-1543, 2014.

M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark et al., Deep contextualized word representations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.1, pp.2227-2237, 2018.

S. Petrov, D. Das, and R. T. McDonald, A universal part-of-speech tagset, Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, Istanbul, pp.2089-2096, 2012.

A. Radford and K. Narasimhan, Improving language understanding by generative pre-training, 2018.

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei et al., Language models are unsupervised multitask learners, OpenAI Blog, vol.1, issue.8, 2019.

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang et al., Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arXiv e-prints, 2019.

A. Smith, B. Bohnet, M. de Lhoneux, J. Nivre, Y. Shao et al., 82 treebanks, 34 models: Universal dependency parsing with multi-treebank models, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.113-123, 2018.

M. Straka, UDPipe 2.0 prototype at CoNLL 2018 UD shared task, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.197-207, 2018.

M. Straka and J. Straková, Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe, Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.88-99, 2017.

M. Straka and J. Straková, Evaluating contextualized embeddings on 54 languages in POS tagging, lemmatization and dependency parsing, 2019.

E. Strubell, A. Ganesh, and A. McCallum, Energy and policy considerations for deep learning in NLP, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.3645-3650, 2019.

M. Taulé, M. A. Martí, and M. Recasens, AnCora: Multilevel annotated corpora for Catalan and Spanish, Proceedings of the International Conference on Language Resources and Evaluation, 2008.

T. H. Trinh and Q. V. Le, A simple method for commonsense reasoning, 2018.

G. Wenzek, M. Lachaux, A. Conneau, V. Chaudhary, F. Guzmán et al., CC-Net: Extracting High Quality Monolingual Datasets from Web Crawl Data. arXiv e-prints, 2019.

F. Wu and D. S. Weld, Open information extraction using Wikipedia, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp.118-127, 2010.

D. Zeman, J. Hajič, M. Popel, M. Potthast, M. Straka et al., CoNLL 2018 shared task: Multilingual parsing from raw text to universal dependencies, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp.1-21, 2018.