Notation-Parametric Grammar Recovery

Vadim Zaytsev 1
1 ATEAMS - Analysis and Transformation based on rEliAble tool coMpositionS
Inria Lille - Nord Europe, CWI - Centrum Wiskunde & Informatica
Abstract : Automation of grammar recovery is an important research area that received attention over the last decade and a half. Given the abundance of available documentation for software languages that is only going to keep increasing in the future, there is a need for reliable extraction techniques that allow grammar engineers to derive useful information from it. This information can be further used to build grammarware, like parsers or test generators, or to perform grammar investigation. Grammars obtained systematically from existing sources are always preferred over manually constructed ones due to traceability of their issues, including errors and design weaknesses. This paper focuses on automated grammar recovery from sources that utilise a family of metasyntaxes known as EBNF: many language specifications extend the well-studied Backus-Naur Form in different directions, resulting in unnecessary diversity of syntactic notations. To enable manipulation of EBNF families, we use EDD, the EBNF Dialect Definition, a recently published DSL for notation specification, and base our approach on human-specified indications that guide the subsequent automated heuristic-based recovery process. Two separate scenarios are considered in the paper: a reliable syntactic notation and an unreliable one, with the latter being remarkably more difficult to handle, but also substantially more useful since it is so often encountered in practice. The proposed approach has been verified by two prototypes that were capable of extracting dozens of grammars written in 42 different syntactic notations.
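The idea of parametrising recovery by the notation can be illustrated with a minimal sketch. It is not the paper's EDD language or either of its prototypes: the `Notation` class, its field names, and the `recover` function are hypothetical, and the sketch assumes the simpler "reliable notation" scenario, where every production follows the declared symbols exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notation:
    """Hypothetical notation description; field names are illustrative, not EDD's."""
    defining: str     # separates a nonterminal from its definition, e.g. "::=" or "="
    terminator: str   # ends a production, e.g. ";" or "."
    alternative: str  # separates alternatives, e.g. "|" or "/"


def recover(text: str, notation: Notation) -> dict:
    """Split grammar text into productions using only the notation's symbols."""
    grammar = {}
    for chunk in text.split(notation.terminator):
        if notation.defining not in chunk:
            continue  # skip blank or non-production fragments
        lhs, rhs = chunk.split(notation.defining, 1)
        # Each alternative becomes a list of whitespace-separated symbols.
        alternatives = [alt.split() for alt in rhs.split(notation.alternative)]
        grammar.setdefault(lhs.strip(), []).extend(alternatives)
    return grammar


# Two made-up dialects describing the same grammar with different symbols.
bnf_like = Notation(defining="::=", terminator=";", alternative="|")
iso_like = Notation(defining="=", terminator=".", alternative="/")

g1 = recover("expr ::= term | expr plus term ; term ::= num ;", bnf_like)
g2 = recover("expr = term / expr plus term . term = num .", iso_like)
```

Both calls yield the same internal structure, because the recovery code consults only the `Notation` value and never hard-codes a symbol. The unreliable scenario described in the abstract would need heuristics on top of this, which the sketch does not attempt.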
Document type :
Conference papers

https://hal.inria.fr/hal-00756889
Contributor : Jurgen Vinju
Submitted on : Friday, November 23, 2012 - 10:28:06 PM
Last modification on : Friday, February 9, 2018 - 2:02:05 PM

Identifiers

  • HAL Id : hal-00756889, version 1

Citation

Vadim Zaytsev. Notation-Parametric Grammar Recovery. Pre-proceedings of the 12th International Workshop on Language Descriptions, Tools, and Applications (LDTA 2012), Mar 2012, Tallinn, Estonia. pp.105 - 118. ⟨hal-00756889⟩
