Research output: Contribution to journal › Article › peer-review
Prosodic boundary detection using syntactic and acoustic information. / Kocharov, D.; Kachkovskaia, T.; Skrelin, P.
In: Computer Speech and Language, Vol. 53, 01.01.2019, p. 231-241.
TY - JOUR
T1 - Prosodic boundary detection using syntactic and acoustic information
AU - Kocharov, D.
AU - Kachkovskaia, T.
AU - Skrelin, P.
PY - 2019/1/1
Y1 - 2019/1/1
AB - This paper presents a two-stage procedure for automatic prosodic boundary detection in Russian based on textual and acoustic data. The key idea of the method is (1) to predict all potential prosodic boundaries based on syntax and (2) among these potential boundaries, to choose those which are marked acoustically. For the first stage we developed a system which predicted a potential boundary whenever two adjacent words were not connected with each other in terms of syntax; for this we used a dependency tree parser and added several simple rules. At the second stage we ran a random forest classifier to detect the actual prosodic boundaries using a small set of acoustic features. Of all the observed prosodic features, pause duration worked best, and for some speakers it could be used as the only acoustic cue with no change in efficiency. For other speakers, however, other features were useful, such as tempo and amplitude resets or F0 range, and the choice of features was speaker-dependent. Overall, the procedure achieved an F1 measure of 0.91, recall of 0.90 and precision of 0.93, which is the best published result for Russian.
KW - Prosodic phrasing
KW - Automatic boundary detection
KW - Dependency parsing
KW - Acoustic feature
KW - Russian
UR - http://www.scopus.com/inward/record.url?scp=85052865816&partnerID=8YFLogxK
U2 - 10.1016/j.csl.2018.07.001
DO - 10.1016/j.csl.2018.07.001
M3 - Article
VL - 53
SP - 231
EP - 241
JO - Computer Speech and Language
JF - Computer Speech and Language
SN - 0885-2308
ER -
ID: 33862145
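
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "not connected in terms of syntax" means there is no direct dependency arc between two adjacent words, it leaves out the additional rules mentioned in the abstract, and it swaps the random forest classifier for a plain pause-duration threshold. The function names, the toy sentence, the pause values and the 0.15 s threshold are all hypothetical. The last function checks that the reported precision and recall are consistent with the reported F1 measure.

```python
def candidate_boundaries(heads):
    """Stage 1 (simplified): potential boundary between words i and i+1
    when neither word is the direct syntactic head of the other.

    heads[i] is the index of the head of word i, or -1 for the root.
    """
    candidates = []
    for i in range(len(heads) - 1):
        linked = heads[i] == i + 1 or heads[i + 1] == i
        if not linked:
            candidates.append(i)
    return candidates


def select_by_pause(candidates, pause_after, min_pause=0.15):
    """Stage 2 stand-in: keep candidates followed by a long enough pause.

    The paper uses a random forest over several acoustic features;
    a fixed threshold (hypothetical value, in seconds) is used here.
    """
    return [i for i in candidates if pause_after[i] >= min_pause]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy English example with hand-written dependency heads (hypothetical data).
words = ["she", "left", "and", "he", "stayed"]
heads = [1, -1, 4, 4, 1]              # "left" is the root
pause_after = [0.0, 0.32, 0.02, 0.0]  # pause in seconds after words 0..3

candidates = candidate_boundaries(heads)
boundaries = select_by_pause(candidates, pause_after)

print("candidates after:", [words[i] for i in candidates])
print("boundaries after:", [words[i] for i in boundaries])
print("F1 from reported P and R:", round(f1_score(0.93, 0.90), 2))
```

In this toy case the syntactic stage proposes two candidates (after "left" and after "and"), and the pause check keeps only the first one, which mirrors the over-generate-then-filter design the abstract describes.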