Prosodic boundary detection using syntactic and acoustic information

Research output

Abstract

This paper presents a two-stage procedure for automatic prosodic boundary detection in Russian based on textual and acoustic data. The key idea of the method is (1) to predict all potential prosodic boundaries based on syntax and (2) to choose, among these potential boundaries, those which are marked acoustically. For the first stage, we developed a system which predicted a potential boundary whenever two adjacent words were not syntactically connected with each other; for this we used a dependency tree parser and added several simple rules. At the second stage, we ran a random forest classifier to detect the actual prosodic boundaries using a small set of acoustic features. Of all the observed prosodic features, pause duration worked best, and for some speakers it could be used as the only acoustic cue with no change in efficiency. For other speakers, however, other features were useful, such as tempo and amplitude resets or F0 range, and the choice of features was speaker-dependent. Overall, the procedure achieved an F1 measure of 0.91, with recall of 0.90 and precision of 0.93, which is the best published result for Russian.
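The first stage described in the abstract can be illustrated with a minimal sketch. It assumes a dependency parse is already available as a list of head indices; the function names, the head-index encoding, and the English example sentence are hypothetical and are not taken from the paper. The sketch does not include the additional simple rules the authors mention, nor the second-stage random forest classifier. A small helper also checks that the reported F1 measure agrees with the reported precision and recall.

```python
from typing import List


def potential_boundaries(heads: List[int]) -> List[int]:
    """Return every index i where a potential boundary falls between word i and word i+1.

    heads[i] is the index of the syntactic head of word i, or -1 for the root
    (an assumed encoding, chosen only for this illustration).
    Two adjacent words count as connected when one is the head of the other.
    """
    boundaries = []
    for i in range(len(heads) - 1):
        connected = heads[i] == i + 1 or heads[i + 1] == i
        if not connected:
            boundaries.append(i)
    return boundaries


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical parse of "the old man sat on the bench":
# the->man, old->man, man->sat, sat->ROOT, on->bench, the->bench, bench->sat
heads = [2, 2, 3, -1, 6, 6, 3]
candidates = potential_boundaries(heads)  # candidates == [0, 3, 4]

# Consistency check on the figures reported in the abstract.
reported_f1 = round(f1_score(0.93, 0.90), 2)  # reported_f1 == 0.91
```

In this toy parse the bare adjacency rule proposes more candidates than a speaker would realise as boundaries, which is consistent with the abstract's two-stage design: the syntactic stage over-generates and the acoustic stage selects among the candidates.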

Original language: English
Pages (from-to): 231-241
Number of pages: 11
Journal: Computer Speech and Language
Volume: 53
DOI: 10.1016/j.csl.2018.07.001
Publication status: Published - 1 Jan 2019

Fingerprint

Boundary Detection
Syntactics
Acoustics
Two-stage Procedure
Random Forest
Classifiers
Adjacent
Choose
Classifier
Predict
Syntax
Prosodic Boundaries
Dependent
Range of data

Scopus subject areas

  • Arts and Humanities(all)
  • Software
  • Theoretical Computer Science
  • Human-Computer Interaction

Cite this

@article{110e529bebe84ba4bab532c650e4e32d,
title = "Prosodic boundary detection using syntactic and acoustic information",
keywords = "Prosodic phrasing, Automatic boundary detection, Dependency parsing, Acoustic feature, Russian",
author = "D. Kocharov and T. Kachkovskaia and P. Skrelin",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.csl.2018.07.001",
language = "English",
volume = "53",
pages = "231--241",
journal = "Computer Speech and Language",
issn = "0885-2308",
publisher = "Elsevier",
}

TY  - JOUR
T1  - Prosodic boundary detection using syntactic and acoustic information
AU  - Kocharov, D.
AU  - Kachkovskaia, T.
AU  - Skrelin, P.
PY  - 2019/1/1
Y1  - 2019/1/1
AB  - This paper presents a two-stage procedure for automatic prosodic boundary detection in Russian based on textual and acoustic data. The key idea of the method is (1) to predict all potential prosodic boundaries based on syntax and (2) to choose, among these potential boundaries, those which are marked acoustically. For the first stage, we developed a system which predicted a potential boundary whenever two adjacent words were not syntactically connected with each other; for this we used a dependency tree parser and added several simple rules. At the second stage, we ran a random forest classifier to detect the actual prosodic boundaries using a small set of acoustic features. Of all the observed prosodic features, pause duration worked best, and for some speakers it could be used as the only acoustic cue with no change in efficiency. For other speakers, however, other features were useful, such as tempo and amplitude resets or F0 range, and the choice of features was speaker-dependent. Overall, the procedure achieved an F1 measure of 0.91, with recall of 0.90 and precision of 0.93, which is the best published result for Russian.
KW  - Prosodic phrasing
KW  - Automatic boundary detection
KW  - Dependency parsing
KW  - Acoustic feature
KW  - Russian
UR  - http://www.scopus.com/inward/record.url?scp=85052865816&partnerID=8YFLogxK
DO  - 10.1016/j.csl.2018.07.001
M3  - Article
VL  - 53
SP  - 231
EP  - 241
JO  - Computer Speech and Language
JF  - Computer Speech and Language
SN  - 0885-2308
ER  -