TraceSim: A method for calculating stack trace similarity

Roman Vasiliev, Dmitrij Koznov, George Chernishev, Aleksandr Khvorov, Dmitry Luciv, Nikita Povarov

Research output: Publications in books, reports, collections, conference proceedings; article in conference proceedings; research; peer-reviewed

Abstract

Many contemporary software products have subsystems for automatic crash reporting. However, it is well known that the same bug can produce slightly different reports. To manage this problem, reports are usually grouped, often manually by developers. Manual triaging, however, becomes infeasible for products that have large user bases, which is why many different approaches to automating this task have been proposed. Moreover, improving the quality of triaging is important because of the large volume of reports that must be processed properly. Therefore, even a relatively small improvement could play a significant role in the overall accuracy of report bucketing. The majority of existing studies use some kind of stack trace similarity metric, based either on information retrieval techniques or on string matching methods. However, it should be stressed that the quality of triaging is still insufficient. In this paper, we describe TraceSim, a novel approach to this problem which combines TF-IDF, Levenshtein distance, and machine learning to construct a similarity metric. Our metric has been implemented inside an industrial-grade report triaging system. The evaluation on a manually labeled dataset shows significantly better results compared to baseline approaches.
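The abstract names the ingredients of the metric but does not say how they are combined. The sketch below is purely illustrative and is not the authors' algorithm: it assumes one plausible way to pair the first two ingredients, a Levenshtein distance over stack frames in which each edit is weighted by the inverse document frequency (IDF) of the frames involved, so that mismatches on rare frames cost more than mismatches on ubiquitous ones. The machine learning component is left out entirely, and all function names, the smoothing formula, and the normalization are assumptions made for this example.

```python
import math
from collections import Counter


def idf_weights(traces):
    """Smoothed inverse document frequency of each frame over a corpus of traces."""
    n = len(traces)
    # Count each frame once per trace, no matter how often it repeats in it.
    df = Counter(frame for trace in traces for frame in set(trace))
    return {frame: math.log((1 + n) / (1 + count)) + 1.0 for frame, count in df.items()}


def weighted_edit_distance(a, b, weight):
    """Levenshtein distance over frames; each edit costs the weight of the frames involved."""
    # Row 0: turning an empty trace into a prefix of b costs the inserted weights.
    prev = [0.0]
    for frame in b:
        prev.append(prev[-1] + weight(frame))
    for fa in a:
        cur = [prev[0] + weight(fa)]  # delete every frame of a seen so far
        for j, fb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0.0 if fa == fb else max(weight(fa), weight(fb)))
            delete = prev[j] + weight(fa)
            insert = cur[j - 1] + weight(fb)
            cur.append(min(substitute, delete, insert))
        prev = cur
    return prev[-1]


def similarity(a, b, idf, default=1.0):
    """Map the weighted distance to [0, 1]; 1.0 means identical traces.

    `default` is an assumed weight for frames that never occur in the corpus.
    """
    def weight(frame):
        return idf.get(frame, default)

    total = sum(weight(f) for f in a) + sum(weight(f) for f in b)
    if total == 0:
        return 1.0  # both traces are empty
    return 1.0 - weighted_edit_distance(a, b, weight) / total
```

With a corpus such as `[["main", "run", "parse"], ["main", "run", "render"], ["main", "io", "read"]]`, the frame `main` occurs in every trace and so receives the lowest weight, and the first two traces score as more similar to each other than the first and the third. A learned model, as mentioned in the abstract, would presumably replace the fixed weighting and normalization chosen here.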

Original language: English
Title of host publication: MaLTeSQuE 2020 - Proceedings of the 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation, Co-located with ESEC/FSE 2020
Editors: Foutse Khomh, Pasquale Salza, Gemma Catolino
Publisher: Association for Computing Machinery
Pages: 25-30
Number of pages: 6
ISBN (electronic): 9781450381246
DOI
Publication status: Published - 13 Nov 2020
Event: 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation, MaLTeSQuE 2020, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020 - Virtual, Online, United States
Duration: 13 Nov 2020 → …

Publication series

Name: MaLTeSQuE 2020 - Proceedings of the 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation, Co-located with ESEC/FSE 2020

Conference

Conference: 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation, MaLTeSQuE 2020, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020
Country: United States
City: Virtual, Online
Period: 13/11/20 → …

Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Safety, Risk, Reliability and Quality
