Construction of a Russian Paraphrase Corpus

Standard

Construction of a Russian Paraphrase Corpus : Unsupervised Paraphrase Extraction. / Pronoza, Ekaterina ; Yagunova, Elena ; Pronoza, Anton.

INFORMATION RETRIEVAL, (RUSSIR 2015). ed. / P Braslavski; Markov; P Pardalos; Y Volkovich; DI Ignatov; S Koltsov; O Koltsova. Springer Nature, 2016. p. 146-157 (Communications in Computer and Information Science; Vol. 573).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Pronoza, E, Yagunova, E & Pronoza, A 2016, Construction of a Russian Paraphrase Corpus: Unsupervised Paraphrase Extraction. in P Braslavski, Markov, P Pardalos, Y Volkovich, DI Ignatov, S Koltsov & O Koltsova (eds), INFORMATION RETRIEVAL, (RUSSIR 2015). Communications in Computer and Information Science, vol. 573, Springer Nature, pp. 146-157, 9th Russian Summer School in Information Retrieval (RuSSIR), St Petersburg, 24/08/15. https://doi.org/10.1007/978-3-319-41718-9_8

APA

Pronoza, E., Yagunova, E., & Pronoza, A. (2016). Construction of a Russian Paraphrase Corpus: Unsupervised Paraphrase Extraction. In P. Braslavski, Markov, P. Pardalos, Y. Volkovich, D. I. Ignatov, S. Koltsov, & O. Koltsova (Eds.), INFORMATION RETRIEVAL, (RUSSIR 2015) (pp. 146-157). (Communications in Computer and Information Science; Vol. 573). Springer Nature. https://doi.org/10.1007/978-3-319-41718-9_8

Vancouver

Pronoza E, Yagunova E, Pronoza A. Construction of a Russian Paraphrase Corpus: Unsupervised Paraphrase Extraction. In Braslavski P, Markov, Pardalos P, Volkovich Y, Ignatov DI, Koltsov S, Koltsova O, editors, INFORMATION RETRIEVAL, (RUSSIR 2015). Springer Nature. 2016. p. 146-157. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-319-41718-9_8

Author

Pronoza, Ekaterina ; Yagunova, Elena ; Pronoza, Anton. / Construction of a Russian Paraphrase Corpus : Unsupervised Paraphrase Extraction. INFORMATION RETRIEVAL, (RUSSIR 2015). editor / P Braslavski ; Markov ; P Pardalos ; Y Volkovich ; DI Ignatov ; S Koltsov ; O Koltsova. Springer Nature, 2016. pp. 146-157 (Communications in Computer and Information Science).

BibTeX

@inproceedings{6a585fc84e7042f0ae8a0b98ec1ace24,

title = "Construction of a Russian Paraphrase Corpus: Unsupervised Paraphrase Extraction",

abstract = "This paper presents a crowdsourcing project on the creation of a publicly available corpus of sentential paraphrases for Russian. Collected from news headlines, such a corpus could be applied to information extraction and text summarization. We collect news headlines from different agencies in real time; paraphrase candidates are extracted from the headlines using an unsupervised matrix similarity metric. We provide a user-friendly online interface for crowdsourced annotation, which is available at paraphraser.ru. There are 5181 annotated sentence pairs at the moment, with 4758 of them included in the corpus. The annotation process is ongoing, and the current version of the corpus is freely available at http://paraphraser.ru.",

keywords = "Russian paraphrase corpus, Lexical similarity metric, Unsupervised paraphrase extraction, Crowdsourcing",

author = "Ekaterina Pronoza and Elena Yagunova and Anton Pronoza",

year = "2016",

doi = "10.1007/978-3-319-41718-9_8",

language = "English",

isbn = "978-3-319-41717-2",

series = "Communications in Computer and Information Science",

publisher = "Springer Nature",

pages = "146--157",

editor = "P Braslavski and Markov and P Pardalos and Y Volkovich and DI Ignatov and S Koltsov and O Koltsova",

booktitle = "INFORMATION RETRIEVAL, (RUSSIR 2015)",

address = "Germany",

note = "Conference date: 24-08-2015 Through 28-08-2015",

}

RIS

TY - GEN

T1 - Construction of a Russian Paraphrase Corpus

AU - Pronoza, Ekaterina

AU - Yagunova, Elena

AU - Pronoza, Anton

PY - 2016

Y1 - 2016

N2 - This paper presents a crowdsourcing project on the creation of a publicly available corpus of sentential paraphrases for Russian. Collected from news headlines, such a corpus could be applied to information extraction and text summarization. We collect news headlines from different agencies in real time; paraphrase candidates are extracted from the headlines using an unsupervised matrix similarity metric. We provide a user-friendly online interface for crowdsourced annotation, which is available at paraphraser.ru. There are 5181 annotated sentence pairs at the moment, with 4758 of them included in the corpus. The annotation process is ongoing, and the current version of the corpus is freely available at http://paraphraser.ru.

AB - This paper presents a crowdsourcing project on the creation of a publicly available corpus of sentential paraphrases for Russian. Collected from news headlines, such a corpus could be applied to information extraction and text summarization. We collect news headlines from different agencies in real time; paraphrase candidates are extracted from the headlines using an unsupervised matrix similarity metric. We provide a user-friendly online interface for crowdsourced annotation, which is available at paraphraser.ru. There are 5181 annotated sentence pairs at the moment, with 4758 of them included in the corpus. The annotation process is ongoing, and the current version of the corpus is freely available at http://paraphraser.ru.

KW - Russian paraphrase corpus

KW - Lexical similarity metric

KW - Unsupervised paraphrase extraction

KW - Crowdsourcing

U2 - 10.1007/978-3-319-41718-9_8

DO - 10.1007/978-3-319-41718-9_8

M3 - Conference contribution

SN - 978-3-319-41717-2

T3 - Communications in Computer and Information Science

SP - 146

EP - 157

BT - INFORMATION RETRIEVAL, (RUSSIR 2015)

A2 - Braslavski, P

A2 - Markov

A2 - Pardalos, P

A2 - Volkovich, Y

A2 - Ignatov, DI

A2 - Koltsov, S

A2 - Koltsova, O

PB - Springer Nature

Y2 - 24 August 2015 through 28 August 2015

ER -

ID: 89669620