A ML-LLM pairing for better code comment classification - Inria - National Institute for Research in Digital Science and Technology
Conference paper Year: 2023

A ML-LLM pairing for better code comment classification

Abstract

The "Information Retrieval in Software Engineering (IRSE)" shared task at FIRE 2023 introduces code comment classification, a challenging task that pairs a code snippet with a comment, which must be judged as either useful or not useful for understanding the relevant code. We address this shared task with a two-fold evaluation. From an algorithmic perspective, we compare the performance of classical machine learning systems. From a data-driven perspective, we generate additional data by prompting a large language model (LLM) and measure the potential increase in performance. Our best model, which took second place in the shared task, is a neural network with a Macro-F1 score of 88.401% on the provided seed data and a 1.5% overall increase in performance on the data generated by the LLM.
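For readers unfamiliar with the Macro-F1 metric reported above, the following is a minimal sketch of how it can be computed for a two-class useful / not useful labelling. The function name `macro_f1`, the label strings, and the toy labels are illustrative assumptions; they are not taken from the paper or from the shared task data.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not data from the shared task.
y_true = ["useful", "useful", "not_useful", "not_useful", "useful", "not_useful"]
y_pred = ["useful", "not_useful", "not_useful", "not_useful", "useful", "useful"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the average, Macro-F1 penalizes a classifier that performs well on only one of the two labels, which is why it is a common choice when classes may be imbalanced.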
Main file
FIRE_IRSE_2023.pdf (810.82 KB)
Origin: Files produced by the author(s)
License: CC BY - Attribution

Dates and versions

hal-04311401, version 1 (28-11-2023)

Cite

Hanna Abi Akl. A ML-LLM pairing for better code comment classification. FIRE (Forum for Information Retrieval Evaluation) 2023, Prasenjit Majumder; Kripabandhu Ghosh; Thomas Mandl; Debasis Ganguly; Parth Gupta; Bhaskar Mitra; Srijoni Majumdar; Jyoti D Pawar; Pabitra Mitra; Parth Mehta, Dec 2023, Goa, India. ⟨hal-04311401⟩