Topic modeling has emerged over the last decade as a powerful tool for analyzing large text corpora, including Web-based user-generated texts. Topic stability, however, remains a concern: topic models have a very complex optimization landscape with many local maxima, and even different runs of the same model can yield very different topics. To add stability to topic modeling, we propose an approach based on local density regularization, in which words in the local context window of a given word have a higher probability of being assigned the same topic as that word. We compare several models with local density regularizers and show that they can improve topic stability while remaining on par with classical models in terms of quality metrics.
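
The abstract does not spell out the regularizer, but the research-area keywords point to latent Dirichlet allocation and Gibbs sampling, so the following minimal Python sketch shows one way a local density regularizer of this kind could enter a collapsed Gibbs sampler. The function name, the window size `window`, the strength `lam`, and the multiplicative exponential boost are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def gibbs_lda_local(docs, V, K, alpha=0.1, beta=0.01,
                    window=2, lam=1.0, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA with an illustrative local
    density regularizer: the sampling weight of topic k at position i
    is boosted by exp(lam * n_win[k]), where n_win[k] counts the
    neighbours of i (within `window` positions) currently on topic k.
    This exact form is an assumption, not the paper's formulation."""
    rng = np.random.default_rng(seed)
    D = len(docs)
    ndk = np.zeros((D, K))              # document-topic counts
    nkw = np.zeros((K, V))              # topic-word counts
    nk = np.zeros(K)                    # total words per topic
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):      # initialize count tables
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]             # remove current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # standard collapsed-Gibbs sampling weights
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                # count topics of neighbouring words in the local window
                lo, hi = max(0, i - window), min(len(doc), i + window + 1)
                n_win = np.zeros(K)
                for j in range(lo, hi):
                    if j != i:
                        n_win[z[d][j]] += 1
                p *= np.exp(lam * n_win)   # local density boost (assumed form)
                p /= p.sum()
                k = rng.choice(K, p=p)     # resample this word's topic
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    phi = (nkw + beta) / (nkw.sum(axis=1, keepdims=True) + V * beta)
    theta = (ndk + alpha) / (ndk.sum(axis=1, keepdims=True) + K * alpha)
    return phi, theta, z

# Toy usage: two tiny documents over a 6-word vocabulary, 2 topics.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4]]
phi, theta, z = gibbs_lda_local(docs, V=6, K=2, iters=100)
print(theta)   # each document should concentrate on one topic
```

With `lam = 0` the boost vanishes and the sampler reduces to standard collapsed Gibbs for LDA; increasing `lam` makes neighbouring words more likely to share a topic, which is the intuition behind the stability effect described in the abstract.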

Original language: English
Title of host publication: Internet Science - 3rd International Conference, INSCI 2016, Proceedings
Editors: Anna Satsiou, Yanina Welp, Thanassis Tiropanis, Dominic DiFranzo, Ioannis Stavrakakis, Franco Bagnoli, Paolo Nesi, Giovanna Pacini
Publisher: Springer Nature
Pages: 176-188
Number of pages: 13
ISBN (Print): 9783319459813
DOIs
State: Published - 2016
Event: 3rd International Conference on Internet Science, INSCI 2016 - Florence, Italy
Duration: 12 Sep 2016 – 14 Sep 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9934 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Internet Science, INSCI 2016
Country/Territory: Italy
City: Florence
Period: 12/09/16 – 14/09/16

Scopus subject areas

• Theoretical Computer Science
• Computer Science (all)

Research areas

• Gibbs sampling, Latent Dirichlet allocation, Topic modeling

ID: 7604879