Standard

PSIMiner: A tool for mining rich abstract syntax trees from code. / Spirin, Egor; Bogomolov, Egor; Kovalenko, Vladimir; Bryksin, Timofey.

2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021). Institute of Electrical and Electronics Engineers Inc., 2021. p. 13-17 9463105 (IEEE International Working Conference on Mining Software Repositories).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Harvard

Spirin, E, Bogomolov, E, Kovalenko, V & Bryksin, T 2021, PSIMiner: A tool for mining rich abstract syntax trees from code. in 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021)., 9463105, IEEE International Working Conference on Mining Software Repositories, Institute of Electrical and Electronics Engineers Inc., pp. 13-17, 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021, Virtual, Online, 17/05/21. https://doi.org/10.1109/MSR52588.2021.00014

APA

Spirin, E., Bogomolov, E., Kovalenko, V., & Bryksin, T. (2021). PSIMiner: A tool for mining rich abstract syntax trees from code. In 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021) (pp. 13-17). [9463105] (IEEE International Working Conference on Mining Software Repositories). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/MSR52588.2021.00014

Vancouver

Spirin E, Bogomolov E, Kovalenko V, Bryksin T. PSIMiner: A tool for mining rich abstract syntax trees from code. In 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021). Institute of Electrical and Electronics Engineers Inc. 2021. p. 13-17. 9463105. (IEEE International Working Conference on Mining Software Repositories). https://doi.org/10.1109/MSR52588.2021.00014

Author

Spirin, Egor; Bogomolov, Egor; Kovalenko, Vladimir; Bryksin, Timofey. / PSIMiner: A tool for mining rich abstract syntax trees from code. 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021). Institute of Electrical and Electronics Engineers Inc., 2021. pp. 13-17 (IEEE International Working Conference on Mining Software Repositories).

BibTeX

@inproceedings{d70a684c14c6416bac3fe7343a5d2001,
title = "PSIMiner: A tool for mining rich abstract syntax trees from code",
abstract = "The application of machine learning algorithms to source code has grown in the past years. Since these algorithms are quite sensitive to input data, it is not surprising that researchers experiment with input representations. Nowadays, a popular starting point to represent code is abstract syntax trees (ASTs). Abstract syntax trees have been used for a long time in various software engineering domains, and in particular in IDEs. The API of modern IDEs allows to manipulate and traverse ASTs, resolve references between code elements, etc. Such algorithms can enrich ASTs with new data and therefore may be useful in ML-based code analysis. In this work, we present PSIMiner - a tool for processing PSI trees from the IntelliJ Platform. PSI trees contain code syntax trees as well as functions to work with them, and therefore can be used to enrich code representation using static analysis algorithms of modern IDEs. To showcase this idea, we use our tool to infer types of identifiers in Java ASTs and extend the code2seq model for the method name prediction problem. ",
keywords = "Code representation, Data mining, Software Engineering",
author = "Egor Spirin and Egor Bogomolov and Vladimir Kovalenko and Timofey Bryksin",
note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021 ; Conference date: 17-05-2021 Through 19-05-2021",
year = "2021",
month = may,
day = "1",
doi = "10.1109/MSR52588.2021.00014",
language = "English",
isbn = "9781728187105",
series = "IEEE International Working Conference on Mining Software Repositories",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "13--17",
booktitle = "2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021)",
address = "United States",

}

RIS

TY - CONF

T1 - PSIMiner: A tool for mining rich abstract syntax trees from code

T2 - 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021

AU - Spirin, Egor

AU - Bogomolov, Egor

AU - Kovalenko, Vladimir

AU - Bryksin, Timofey

N1 - Publisher Copyright: © 2021 IEEE.

PY - 2021/5/1

Y1 - 2021/5/1

AB - The application of machine learning algorithms to source code has grown in the past years. Since these algorithms are quite sensitive to input data, it is not surprising that researchers experiment with input representations. Nowadays, a popular starting point to represent code is abstract syntax trees (ASTs). Abstract syntax trees have been used for a long time in various software engineering domains, and in particular in IDEs. The API of modern IDEs allows to manipulate and traverse ASTs, resolve references between code elements, etc. Such algorithms can enrich ASTs with new data and therefore may be useful in ML-based code analysis. In this work, we present PSIMiner - a tool for processing PSI trees from the IntelliJ Platform. PSI trees contain code syntax trees as well as functions to work with them, and therefore can be used to enrich code representation using static analysis algorithms of modern IDEs. To showcase this idea, we use our tool to infer types of identifiers in Java ASTs and extend the code2seq model for the method name prediction problem.

KW - Code representation

KW - Data mining

KW - Software Engineering

UR - http://www.scopus.com/inward/record.url?scp=85113675283&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/fb7e138b-f5a8-3789-b01d-3940aa734e48/

U2 - 10.1109/MSR52588.2021.00014

DO - 10.1109/MSR52588.2021.00014

M3 - Conference contribution

AN - SCOPUS:85113675283

SN - 9781728187105

T3 - IEEE International Working Conference on Mining Software Repositories

SP - 13

EP - 17

BT - 2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021)

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 17 May 2021 through 19 May 2021

ER -
