Conference papers

Split and Rephrase

Abstract : We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning-preserving sequence of shorter sentences. Like sentence simplification, splitting-and-rephrasing has the potential of benefiting both natural language processing and societal applications. Because shorter sentences are generally better processed by NLP systems, it could be used as a preprocessing step which facilitates and improves the performance of parsers, semantic role labelers and machine translation systems. It should also be of use for people with reading disabilities because it allows the conversion of longer sentences into shorter ones. This paper makes two contributions towards this new task. First, we create and make available a benchmark consisting of 1,066,115 tuples mapping a single complex sentence to a sequence of sentences expressing the same meaning. Second, we propose five models (vanilla sequence-to-sequence to semantically-motivated models) to understand the difficulty of the proposed task.
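
To make the shape of a benchmark tuple concrete, here is a minimal Python sketch of one item: a complex sentence mapped to a sequence of shorter sentences. The class name `SplitRephraseItem`, its fields, the two checks, and the example sentence are all assumptions made for illustration; they are not the authors' data format and the sentence is not taken from the benchmark. The checks only test surface properties (number and length of output sentences) and say nothing about meaning preservation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SplitRephraseItem:
    """Hypothetical container for one tuple: complex sentence -> shorter sentences."""
    complex_sentence: str
    simple_sentences: List[str]

    def is_split(self) -> bool:
        # A genuine split yields at least two output sentences.
        return len(self.simple_sentences) >= 2

    def shortens(self) -> bool:
        # Every output sentence should have fewer whitespace tokens than the input.
        n = len(self.complex_sentence.split())
        return all(len(s.split()) < n for s in self.simple_sentences)


# Invented example for illustration only; not drawn from the benchmark.
item = SplitRephraseItem(
    complex_sentence="The bridge, which opened in 1990, connects two cities.",
    simple_sentences=[
        "The bridge opened in 1990.",
        "The bridge connects two cities.",
    ],
)

print(item.is_split(), item.shortens())
```

Whitespace tokenization is used here only to keep the sketch dependency-free; a real evaluation would need a proper tokenizer and a measure of meaning preservation.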


Contributor : Claire Gardent
Submitted on : Wednesday, October 25, 2017 - 4:11:22 PM
Last modification on : Wednesday, November 24, 2021 - 9:54:10 AM
Long-term archiving on : Friday, January 26, 2018 - 3:22:12 PM


Files produced by the author(s)


  • HAL Id : hal-01623746, version 1



Shashi Narayan, Claire Gardent, Shay Cohen, Anastasia Shimorina. Split and Rephrase. EMNLP 2017: Conference on Empirical Methods in Natural Language Processing, Sep 2017, Copenhagen, Denmark. pp.617 - 627. ⟨hal-01623746⟩


