
Improving Part-of-Speech Tagging of Historical Text by First Translating to Modern Text

Abstract : We explore the task of automatically assigning syntactic tags (known as part-of-speech tags), such as Noun and Verb, to words in seventeenth-century Dutch text. Tools exist for performing this task on modern texts, but they perform poorly on historical texts because the language has changed. We test several methods for translating the words in the historical text to modern equivalents before applying the tag assignment tools. We show that this additional translation step improves the quality of the automatic syntactic analysis. Further improvements are possible when the lexicons and text collections used for developing the translation process are extended in size.
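
The two-step idea in the abstract (translate historical words to modern forms, then tag the modern forms) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the lexicon entries, the tag set, and the names LEXICON, MODERN_TAGS, translate, toy_modern_tagger and tag_historical are all assumptions made for this example, and the dictionary-based tagger is only a stand-in for a real modern-Dutch tagging tool.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical historical-to-modern lexicon (illustrative entries only).
LEXICON: Dict[str, str] = {
    "ick": "ik",
    "ende": "en",
    "huys": "huis",
}

# Hypothetical stand-in for a tagger trained on modern text:
# it only knows modern spellings.
MODERN_TAGS: Dict[str, str] = {
    "ik": "Pron",
    "en": "Conj",
    "zie": "Verb",
    "het": "Det",
    "huis": "Noun",
}


def translate(tokens: List[str], lexicon: Dict[str, str]) -> List[str]:
    """Map each token to its modern form; unknown tokens pass through lowercased."""
    return [lexicon.get(t.lower(), t.lower()) for t in tokens]


def toy_modern_tagger(tokens: List[str]) -> List[str]:
    """Tag tokens by lookup; anything not in the modern vocabulary is 'Unknown'."""
    return [MODERN_TAGS.get(t, "Unknown") for t in tokens]


def tag_historical(
    tokens: List[str],
    lexicon: Dict[str, str],
    tagger: Callable[[List[str]], List[str]],
) -> List[Tuple[str, str]]:
    """Translate first, tag the modern forms, then pair tags with the original tokens."""
    modern = translate(tokens, lexicon)
    tags = tagger(modern)
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sentence = ["Ick", "zie", "het", "huys"]
    print(tag_historical(sentence, LEXICON, toy_modern_tagger))
```

Tagging the lowercased historical tokens directly with the toy tagger leaves "ick" and "huys" as Unknown, while the translate-then-tag route recovers a tag for every word in this toy sentence. The tags are projected back onto the original historical tokens, so the source text itself is left unchanged.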
Document type : Conference papers

Cited literature [23 references]

Contributor : Hal Ifip
Submitted on : Friday, October 13, 2017 - 2:52:45 PM
Last modification on : Thursday, March 5, 2020 - 5:40:54 PM
Long-term archiving on : Sunday, January 14, 2018 - 2:31:58 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



Erik Sang. Improving Part-of-Speech Tagging of Historical Text by First Translating to Modern Text. 2nd International Workshop on Computational History and Data-Driven Humanities (CHDDH), May 2016, Dublin, Ireland. pp.54-64, ⟨10.1007/978-3-319-46224-0_6⟩. ⟨hal-01616302⟩


