Conference papers

Data Driven Lemmatization and Parsing of Italian

Abstract : This paper presents preliminary results for data-driven lemmatisation of Italian. Based on a joint lemmatisation and part-of-speech tagging model, our system relies on an architecture that has already proved successful for French. Besides an intrinsic evaluation of this task, we measure its usefulness and adequacy by using our system's output as input for parsing. This approach achieves state-of-the-art parsing accuracy on unlabeled text without any gold information supplied (an F1 score of 83.70% in a 10-fold cross-validation setting) and without requiring any prior knowledge of the language. This shows that our methodology is perfectly suitable for wide-coverage parsing of Italian.
Document type :
Conference papers
Contributor : Djamé Seddah
Submitted on : Friday, January 18, 2013 - 6:16:55 PM
Last modification on : Friday, January 21, 2022 - 3:22:15 AM




Djamé Seddah, Joseph Le Roux, Benoît Sagot. Data Driven Lemmatization and Parsing of Italian. EVALITA 2011 - Evaluation of NLP and Speech Tools for Italian, Jan 2012, Rome, Italy. pp.249-256, ⟨10.1007/978-3-642-35828-9_27⟩. ⟨hal-00778153⟩


