Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
PathMiner: a library for mining of path-based representations of code. / Kovalenko, Vladimir; Bogomolov, Egor; Bryksin, Timofey; Bacchelli, Alberto.
Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019. Vol. 2019 Institute of Electrical and Electronics Engineers Inc., 2019. p. 13-17 8816777.
TY - GEN
T1 - PathMiner: a library for mining of path-based representations of code
AU - Kovalenko, Vladimir
AU - Bogomolov, Egor
AU - Bryksin, Timofey
AU - Bacchelli, Alberto
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
AB - One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of path-based representation - an approach consisting in representing a snippet of code as a collection of paths from its syntax tree. Such representation efficiently captures the structure of code, which, in turn, carries its semantics and other information. Building the path-based representation involves parsing the code and extracting the paths from its syntax tree; these steps build up to a substantial technical job. With no common reusable toolkit existing for this task, the burden of mining diverts the focus of researchers from the essential work and hinders newcomers in the field of machine learning on code. In this paper, we present PathMiner - an open-source library for mining path-based representations of code. PathMiner is fast, flexible, well-tested, and easily extensible to support input code in any common programming language.
KW - Ast path
KW - Code2Vec
KW - Machine learning on code
KW - Mining tool
KW - Path based representation
UR - https://2019.msrconf.org/details/msr-2019-papers/38/PathMiner-A-Library-for-Mining-of-Path-Based-Representations-of-Code
UR - http://www.scopus.com/inward/record.url?scp=85072347462&partnerID=8YFLogxK
U2 - 10.1109/MSR.2019.00013
DO - 10.1109/MSR.2019.00013
M3 - Conference contribution
SN - 9781728134123
VL - 2019
SP - 13
EP - 17
BT - Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th International Conference on Mining Software Repositories
Y2 - 26 May 2019 through 27 May 2019
ER -
ID: 43773778