Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Combining syntactic and acoustic features for prosodic boundary detection in Russian. / Kocharov, D.; Kachkovskaia, T.; Mirzagitova, A.; Skrelin, P.
International Conference on Statistical Language and Speech Processing. Springer Nature, 2016. p. 68-79.
TY - GEN
T1 - Combining syntactic and acoustic features for prosodic boundary detection in Russian
AU - Kocharov, D.
AU - Kachkovskaia, T.
AU - Mirzagitova, A.
AU - Skrelin, P.
N1 - Conference code: 4
PY - 2016
Y1 - 2016
N2 - This paper presents a two-step method of automatic prosodic boundary detection using both textual and acoustic features. Firstly, we predict possible boundary positions using textual features; secondly, we detect the actual boundaries at the predicted positions using acoustic features. To evaluate the algorithms, we use a 26-h subcorpus of CORPRES, a prosodically annotated corpus of Russian read speech. We have also conducted two independent experiments using acoustic features and textual features separately. Acoustic features alone achieve an F1 measure of 0.85, with precision of 0.94 and recall of 0.78. Textual features alone achieve an F1 measure of 0.84, with precision of 0.84 and recall of 0.83. The proposed two-step approach combining the two groups of features yields an efficiency of 0.90, with recall of 0.85 and precision of 0.99. It preserves the high recall provided by textual information and the high precision achieved using acoustic information. This is the best published result for Russian. © Spri
U2 - 10.1007/978-3-319-45925-7_6
DO - 10.1007/978-3-319-45925-7_6
M3 - Conference contribution
SN - 978-331945924-0
SP - 68
EP - 79
BT - International Conference on Statistical Language and Speech Processing
PB - Springer Nature
T2 - International Conference on Statistical Language and Speech Processing
Y2 - 11 October 2016 through 12 October 2016
ER -
ID: 7595047