
The application of machine learning algorithms to source code has grown in recent years. Because these algorithms are sensitive to their input data, it is not surprising that researchers experiment with input representations. Nowadays, a popular starting point for representing code is the abstract syntax tree (AST). ASTs have long been used in various software engineering domains, and in particular in IDEs. The APIs of modern IDEs make it possible to manipulate and traverse ASTs, resolve references between code elements, and so on. Such algorithms can enrich ASTs with new data and may therefore be useful in ML-based code analysis. In this work, we present PSIMiner, a tool for processing PSI trees from the IntelliJ Platform. PSI trees contain code syntax trees as well as functions to work with them, and can therefore be used to enrich code representations using the static analysis algorithms of modern IDEs. To showcase this idea, we use our tool to infer the types of identifiers in Java ASTs and extend the code2seq model for the method name prediction problem.

Original language: English
Title of host publication: 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 13-17
Number of pages: 5
ISBN (electronic): 9781728187105
ISBN (print): 9781728187105
DOI
Publication status: Published - 1 May 2021
Event: 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021 - Virtual, Online
Duration: 17 May 2021 to 19 May 2021

Publication series

Name: IEEE International Working Conference on Mining Software Repositories
Publisher: IEEE COMPUTER SOC
ISSN (print): 2160-1852

Conference

Conference: 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021
City: Virtual, Online
Period: 17/05/21 to 19/05/21

    Scopus subject areas

  • Software
  • Safety, Risk, Reliability and Quality

ID: 87612317