In this study we compare two semantic relatedness algorithms: Explicit Semantic Analysis (ESA) and Word2Vec. ESA represents the meaning of a text in a high-dimensional space of concepts derived from Wikipedia. Word2Vec generates distributed vector representations from large text corpora. Experiments were carried out on the Russian paraphrase corpus of news titles and the Russian ParaPlag paraphrase corpus. The paper contains a thorough analysis of the results and the evaluation procedure.
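
As an illustration of how vector-based relatedness between two short texts can be scored, the sketch below averages word vectors and compares the results with cosine similarity. It is a minimal example under stated assumptions, not the procedure used in the paper: the `TOY_VECTORS` table, the function names, and the averaging step are all hypothetical, and a real experiment would load vectors trained on a large corpus (Word2Vec) or concept weights derived from Wikipedia (ESA).

```python
import math

# Hypothetical 3-dimensional toy vectors, invented for illustration only.
# Real Word2Vec vectors are learned from large corpora and have far more dimensions.
TOY_VECTORS = {
    "president": [0.9, 0.1, 0.0],
    "leader": [0.8, 0.2, 0.1],
    "visits": [0.1, 0.9, 0.0],
    "arrives": [0.2, 0.8, 0.1],
    "moscow": [0.0, 0.1, 0.9],
}


def text_vector(tokens, vectors):
    """Average the vectors of all known tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(text_a, text_b, vectors=TOY_VECTORS):
    """Score two texts by the cosine of their averaged word vectors.

    Averaging is an assumption made for this sketch; texts with no known
    words receive a score of 0.0.
    """
    vec_a = text_vector(text_a.lower().split(), vectors)
    vec_b = text_vector(text_b.lower().split(), vectors)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)


if __name__ == "__main__":
    print(relatedness("president visits moscow", "leader arrives moscow"))
    print(relatedness("president", "moscow"))
```

With these toy values, the paraphrase-like pair scores higher than the unrelated pair, which is the behaviour a relatedness measure is expected to show on a paraphrase corpus.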

Original language: English
Title of host publication: Digital Transformation and Global Society - Third International Conference, DTGS 2018, Revised Selected Papers
Editors: Daniel A. Alexandrov, Yury Kabanov, Olessia Koltsova, Alexander V. Boukhanovsky, Andrei V. Chugunov
Publisher: Springer Nature
Pages: 350-360
Number of pages: 11
ISBN (Print): 9783030028459
State: Published - 2018
Event: 3rd International Conference on Digital Transformation and Global Society, DTGS 2018 - ITMO University, St. Petersburg, Russian Federation
Duration: 30 May 2018 to 2 Jun 2018
http://dtgs.ifmo.ru/

Publication series

Name: Communications in Computer and Information Science
Volume: 859
ISSN (Print): 1865-0929

Conference

Conference: 3rd International Conference on Digital Transformation and Global Society, DTGS 2018
Abbreviated title: DTGS - 2018
Country/Territory: Russian Federation
City: St. Petersburg
Period: 30/05/18 to 2/06/18

    Research areas

  • Explicit semantic analysis, Russian, Text relatedness, Word2vec

    Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)

ID: 37682445