Improving pattern tracking with a language-aware tree differencing algorithm

Nicolas Palix 1, Jean-Rémy Falleri 2, Julia Lawall 3
1 Erods
LIG - Laboratoire d'Informatique de Grenoble, UJF - Université Joseph Fourier - Grenoble 1
3 Whisper - Well Honed Infrastructure Software for Programming Environments and Runtimes
LIP6 - Laboratoire d'Informatique de Paris 6, Inria Paris-Rocquencourt
Abstract: Tracking code fragments of interest is important when monitoring a software project over multiple versions. Various approaches, including our previous work on Herodotos, exploit the notion of Longest Common Subsequence, as computed by readily available tools such as GNU Diff, to map corresponding code fragments. However, efficient code differencing algorithms are typically line-based or word-based, and thus do not report changes at the level of language constructs. Furthermore, they identify only additions and removals, not the movement of a block of code from one part of a file to another. Code fragments of interest that fall within the added and removed regions have to be correlated manually across versions, which is tedious and error-prone. When studying a very large code base over a long period, the number of manual correlations can become an obstacle to the success of a study. In this paper, we investigate the effect of replacing the line-based algorithm currently used by Herodotos with tree matching, as provided by the algorithm of the differencing tool GumTree. In contrast to the line-based approach, the tree-based approach does not generate any manual correlations, but it incurs a high execution time. To address this problem, we propose a hybrid strategy that gives the best of both approaches.
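The limitation described in the abstract, that a line-based diff reports a moved block only as a removal plus an addition, can be illustrated with a minimal sketch. It assumes Python's standard `difflib.SequenceMatcher` as a stand-in for a line-based differencing tool; it is neither GNU Diff nor the algorithm used by Herodotos, and the function name `added_and_removed` and the sample lines are hypothetical.

```python
import difflib


def added_and_removed(old_lines, new_lines):
    """Collect the lines a line-based diff reports as added and as removed."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'replace' counts as both a removal (old side) and an addition (new side)
        if tag in ("delete", "replace"):
            removed.extend(old_lines[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return added, removed


# Hypothetical file contents: the two check(...) lines move from the
# bottom of the file to the top between the two versions.
old = ["a = 1", "b = 2", "check(a)", "check(b)"]
new = ["check(a)", "check(b)", "a = 1", "b = 2"]

added, removed = added_and_removed(old, new)
print("added:  ", added)
print("removed:", removed)
```

In this sketch the moved lines show up in both lists, with nothing linking the two occurrences. A fragment of interest inside such a block is the kind of case that would need manual correlation across versions.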
Document type:
Conference paper
SANER 2015 - 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, Mar 2015, Montreal, Canada. pp. 43-52, 2015. 〈http://saner.soccerlab.polymtl.ca/doku.php?id=en:start〉. 〈10.1109/SANER.2015.7081814〉

https://hal.inria.fr/hal-01213907
Contributor: Julia Lawall
Submitted on: Friday, 6 November 2015 - 14:03:28
Last modified on: Thursday, 11 January 2018 - 06:27:08
Document(s) archived on: Sunday, 7 February 2016 - 10:10:48

File

saner15.pdf
Publisher files allowed on an open archive

Citation

Nicolas Palix, Jean-Rémy Falleri, Julia Lawall. Improving pattern tracking with a language-aware tree differencing algorithm. SANER 2015 - 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, Mar 2015, Montreal, Canada. pp. 43-52, 2015. 〈http://saner.soccerlab.polymtl.ca/doku.php?id=en:start〉. 〈10.1109/SANER.2015.7081814〉. 〈hal-01213907〉
