
Applying machine learning to tasks that operate on code changes requires a numerical representation of those changes. In this work, we propose an approach for obtaining such representations during pre-training and evaluate them on two downstream tasks: applying changes to code and commit message generation. During pre-training, the model learns to apply a given code change correctly. This task requires only the code changes themselves, which makes it unsupervised. On the task of applying code changes, our model outperforms baseline models by 5.9 percentage points in accuracy. On commit message generation, our model matches the results of supervised models trained specifically for this task, which indicates that it encodes code changes well and could be improved further by pre-training on a larger dataset of easily gathered code changes.
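
To make the "requires only code changes themselves" point concrete, the following is a minimal sketch of how a training example for an apply-the-change objective might be derived from a before/after pair with no human labels. It is an illustration only, not the paper's actual pipeline: the token-level diff via `difflib`, the function names `make_example` and `apply_edit`, and the dictionary layout are all assumptions introduced here.

```python
from difflib import SequenceMatcher


def make_example(before, after):
    """Build one example from a code change given as two token lists.

    Hypothetical layout: the model would see `source` and `edit`
    and be trained to produce `target`. No manual label is needed.
    """
    matcher = SequenceMatcher(a=before, b=after, autojunk=False)
    edit = [
        (tag, i1, i2, after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the parts that actually changed
    ]
    return {"source": before, "edit": edit, "target": after}


def apply_edit(source, edit):
    """Reference implementation of the task the model has to learn."""
    out, pos = [], 0
    for _tag, i1, i2, new_tokens in edit:
        out.extend(source[pos:i1])  # copy unchanged tokens before the edit
        out.extend(new_tokens)      # empty list for a pure deletion
        pos = i2                    # skip the replaced or deleted tokens
    out.extend(source[pos:])        # copy the unchanged tail
    return out


before = "int x = 0 ;".split()
after = "long x = 0 ;".split()
example = make_example(before, after)
```

Here `apply_edit(example["source"], example["edit"])` reproduces `example["target"]`, which is why any repository history can supply supervision for this objective without annotation.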

Original language: English
Title of host publication: MaLTESQuE 2021 - Proceedings of the 5th International Workshop on Machine Learning Techniques for Software Quality Evolution, co-located with ESEC/FSE 2021
Editors: Apostolos Ampatzoglou, Daniel Feitosa, Gemma Catolino, Valentina Lenarduzzi
Publisher: Association for Computing Machinery
Pages: 7-12
Number of pages: 6
ISBN (electronic): 9781450386258
DOI
Publication status: Published - 23 Aug 2021
Event: 5th International Workshop on Machine Learning Techniques for Software Quality Evolution, MaLTESQuE 2021, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021 - Virtual, Online, Greece
Duration: 23 Aug 2021 → …

Conference

Conference: 5th International Workshop on Machine Learning Techniques for Software Quality Evolution, MaLTESQuE 2021, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021
Country/Territory: Greece
City: Virtual, Online
Period: 23/08/21 → …

    Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Safety, Risk, Reliability and Quality

ID: 87612403