Analysing Data-To-Text Generation Benchmarks

Abstract : A generation system can only be as good as the data it is trained on. In this short paper, we propose a methodology for analysing data-to-text corpora used for training micro-planners, i.e., systems which, given some input, must produce a text verbalising exactly this input. We apply this methodology to three existing benchmarks and elicit a set of criteria for the creation of a data-to-text benchmark which could better support the development, evaluation and comparison of linguistically sophisticated data-to-text generators.
Document type : Conference paper

https://hal.inria.fr/hal-01623832
Contributor : Claire Gardent
Submitted on : Wednesday, October 25, 2017 - 5:01:35 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Long-term archiving on : Friday, January 26, 2018 - 3:13:02 PM

File

d2tDatasetAnalysis.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01623832, version 1

Citation

Laura Perez-Beltrachini, Claire Gardent. Analysing Data-To-Text Generation Benchmarks. The 10th International Natural Language Generation Conference, Sep 2017, Santiago de Compostela, Spain. ⟨hal-01623832⟩
