S3M: Siamese Stack (Trace) Similarity Measure

Aleksandr Khvorov, Roman Vasiliev, George Chernishev, Irving Muller Rodrigues, Dmitrij Koznov, Nikita Povarov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Abstract

Automatic crash reporting systems have become a de-facto standard in software development. These systems monitor target software, and if a crash occurs they send details to a backend application. Later on, these reports are aggregated and used in the development process to 1) understand whether it is a new or an existing issue, 2) assign these bugs to appropriate developers, and 3) gain a general overview of the application's bug landscape. The efficiency of report aggregation and subsequent operations heavily depends on the quality of the report similarity metric. However, a distinctive feature of this kind of report is that no textual input from the user (i.e., bug description) is available: it contains only stack trace information. In this paper, we present S3M ("extreme") - the first approach to computing stack trace similarity based on deep learning. It is based on a siamese architecture that uses a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset. Additionally, we review the impact of stack trace trimming on the quality of the results.
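As a rough illustration of the architecture named in the abstract, the sketch below wires two stack traces through one shared biLSTM encoder and a small fully-connected classifier, using NumPy only. It is not the authors' implementation: the class names, the dimensions, the frame vocabulary, the unknown-frame handling, and the way the two encodings are combined (absolute difference and element-wise product) are all assumptions made for the example, and the weights are random and untrained, so the scores carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTM:
    """Single-direction LSTM with random, untrained weights (illustration only)."""
    def __init__(self, in_dim, hid_dim):
        self.hid_dim = hid_dim
        # One stacked matrix for the input, forget, candidate and output gates.
        self.W = rng.normal(0.0, 0.1, (4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)

    def last_state(self, xs):
        h = np.zeros(self.hid_dim)
        c = np.zeros(self.hid_dim)
        for x in xs:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h

class SiameseStackSimilarity:
    """Hypothetical siamese model: shared biLSTM encoder + fully-connected classifier."""
    def __init__(self, vocab, emb_dim=8, hid_dim=16):
        self.index = {frame: k for k, frame in enumerate(vocab)}
        # Extra last row is the embedding for frames outside the vocabulary (assumption).
        self.emb = rng.normal(0.0, 0.1, (len(vocab) + 1, emb_dim))
        self.fwd = LSTM(emb_dim, hid_dim)
        self.bwd = LSTM(emb_dim, hid_dim)
        feat_dim = 2 * (2 * hid_dim)  # |u - v| and u * v, each of size 2 * hid_dim
        self.W1 = rng.normal(0.0, 0.1, (hid_dim, feat_dim))
        self.b1 = np.zeros(hid_dim)
        self.w2 = rng.normal(0.0, 0.1, hid_dim)
        self.b2 = 0.0

    def encode(self, trace):
        # The same encoder weights are used for both inputs: this is the siamese part.
        ids = [self.index.get(frame, len(self.index)) for frame in trace]
        xs = self.emb[ids]
        # biLSTM: final state of a forward pass and of a pass over the reversed trace.
        return np.concatenate([self.fwd.last_state(xs), self.bwd.last_state(xs[::-1])])

    def similarity(self, trace_a, trace_b):
        u, v = self.encode(trace_a), self.encode(trace_b)
        feats = np.concatenate([np.abs(u - v), u * v])  # assumed symmetric features
        hidden = np.maximum(0.0, self.W1 @ feats + self.b1)  # ReLU layer
        return float(sigmoid(self.w2 @ hidden + self.b2))  # score in (0, 1)

model = SiameseStackSimilarity(["a.Main.run", "b.Util.parse", "c.Io.read"])
score = model.similarity(["a.Main.run", "b.Util.parse"],
                         ["c.Io.read", "b.Util.parse", "unknown.Frame"])
```

Because both traces pass through the same encoder and the combined features are symmetric, swapping the two arguments yields the same score. A trained model would learn the embedding, LSTM, and classifier weights from labeled pairs of duplicate and non-duplicate reports.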

Original language: English
Title of host publication: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories, MSR 2021
Subtitle of host publication: Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 266-270
ISBN (Electronic): 9781728187105
ISBN (Print): 978-1-6654-2985-6
DOIs
State: Published - May 2021
Event: 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021 - Virtual, Online
Duration: 17 May 2021 to 19 May 2021

Conference

Conference: 18th IEEE/ACM International Conference on Mining Software Repositories, MSR 2021
City: Virtual, Online
Period: 17/05/21 to 19/05/21

Scopus subject areas

  • Software
  • Safety, Risk, Reliability and Quality

Keywords

  • Automatic Crash Reporting
  • Crash Report
  • Deduplication
  • Deep Learning
  • Stack Trace
