The paper presents a new corpus of dialogue speech designed specifically for research on speech entrainment. Given that the degree of accommodation may depend on a number of social factors, the corpus is designed to encompass five types of relations between interlocutors: siblings, close friends, strangers of the same gender, strangers of the other gender, and strangers of whom one holds a higher job position and is older. Another critical design decision is that one speaker is kept the same across all these social settings, which allows us to trace the changes in his or her speech depending on the interlocutor. The basic set of speakers consists of 10 pairs of same-gender siblings (including 4 pairs of identical twins) aged 23-40, and each of them was recorded in the five settings mentioned above. In total we obtained 90 dialogues of 25-60 minutes each. The speakers played a card game and a map game; they were recorded in a soundproof studio, separated by a non-transparent screen so that they could not see each other. The corpus contains orthographic, phonetic and prosodic annotation and is segmented into turns and inter-pausal units.
|Title of host publication||Proceedings of the 12th conference on Language Resources and Evaluation (LREC'20)|
|Publisher||European Language Resources Association (ELRA)|
|Publication status||Published - 2020|
|Event||The 12th Edition of the Language Resources and Evaluation Conference: LREC 2020 - Marseille. Duration: 11 May 2020 → 16 May 2020|
Scopus subject areas
- Arts and Humanities (all)
Kachkovskaia, T., Chukaeva, T., Evdokimova, V., Kholiavin, P., Kriakina, N., Kocharov, D., Mamushina, A., Menshikova, A., & Zimina, S. (2020). SibLing Corpus of Russian Dialogue Speech Designed for Research on Speech Entrainment. In Proceedings of the 12th conference on Language Resources and Evaluation (LREC'20) (pp. 6558‑6563). European Language Resources Association (ELRA).