Mining and Modeling Variability from Natural Language Documents: Two Case Studies

Sana Ben Nasr 1
1 DiverSe - Diversity-centric Software Engineering
Inria Rennes – Bretagne Atlantique, IRISA-D4 - LANGAGE ET GÉNIE LOGICIEL
Abstract: Domain analysis is the process of analyzing a family of products to identify their common and variable features. This process is generally carried out by experts on the basis of existing informal documentation. When performed manually, this activity is both time-consuming and error-prone. This thesis addresses the mining and modeling of variability from informal documentation. We adopt Natural Language Processing (NLP) and data mining techniques to identify features, commonalities, differences and feature dependencies among related products. We investigate the applicability of this idea by instantiating it in two different contexts: (1) reverse engineering Feature Models (FMs) from regulatory requirements in the nuclear domain and (2) synthesizing Product Comparison Matrices (PCMs) from informal product descriptions. In the first case study, we adopt NLP and data mining techniques based on semantic analysis, requirements clustering and association rules to assist experts when constructing feature models from these regulations. In the second case study, our proposed approach relies on contrastive analysis technology to mine domain-specific terms from text, together with information extraction, term clustering and information clustering. The main lesson learnt from the two case studies is that the exploitability and the extraction of variability knowledge depend on the context, the nature of the variability and the nature of the text.
Contributor: Sana Ben Nasr
Submitted on: Wednesday, October 26, 2016 - 8:57:13 PM
Last modification on: Wednesday, November 3, 2021 - 6:05:48 AM


  • HAL Id: tel-01388392, version 1


Sana Ben Nasr. Mining and Modeling Variability from Natural Language Documents: Two Case Studies. Computer Science [cs]. Université Rennes 1, 2016. English. ⟨tel-01388392⟩