Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Oral text is certainly discrete. It is built of “small bricks”: units not only of the lexical level but also of a higher, syntactic level. Ordinary syntagmatic pauses and hesitation pauses, whether physical (unfilled pauses, including breaks within clauses), vocalized (e-e, m-m), or verbal (vot, kak eto, nu, znachit, etc.), are markers of this discreteness. However, these markers single out neither the syntagma nor the sentence as a unit for describing the syntactic structure of an oral text, because any type of pause may occur at any point in the audio sequence. Thus, the search for sentences in spontaneous speech is quite complicated. To obtain such units, the method of coercive punctuation can be proposed; it was used to mark up spontaneous monologues from the collection of oral texts named «Balanced Annotated Textotec». The participants (philology experts) were asked to mark the ends of sentences by placing a period in transcripts in which neither pauses nor punctuation had been marked. They could rely only on the syntactic structure of the text and on the connections between words and predicate centers. Involving more than twenty experts in the experiment makes the results statistically more reliable. In this work we describe the results of the experiment and discuss how they can be used for the automatic detection of sentence boundaries in spontaneous speech.
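The abstract does not say how the experts' period marks were combined, so the following Python sketch is only one plausible way to turn many independent annotations into candidate sentence boundaries. The function names `boundary_agreement` and `consensus_boundaries`, the representation of each annotation as a set of token indices, and the 0.5 agreement threshold are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def boundary_agreement(n_tokens, annotations):
    """Share of annotators who placed a period after each token.

    n_tokens    -- number of word tokens in the unpunctuated transcript
    annotations -- one collection per annotator, holding the indices of the
                   tokens after which that annotator put a period
                   (hypothetical data layout, assumed for this sketch)
    """
    if not annotations:
        raise ValueError("at least one annotator is required")
    votes = Counter()
    for marks in annotations:
        for idx in set(marks):  # count each annotator once per position
            if not 0 <= idx < n_tokens:
                raise ValueError(f"token index {idx} is out of range")
            votes[idx] += 1
    return [votes[i] / len(annotations) for i in range(n_tokens)]


def consensus_boundaries(agreement, threshold=0.5):
    """Token indices whose agreement share reaches the threshold.

    The default threshold is an arbitrary assumption of this sketch.
    """
    return [i for i, share in enumerate(agreement) if share >= threshold]


# Toy example: six tokens, four annotators.
toy_annotations = [{2, 5}, {2, 5}, {3, 5}, {5}]
toy_agreement = boundary_agreement(6, toy_annotations)
toy_boundaries = consensus_boundaries(toy_agreement)
```

In the toy example every annotator ends a sentence after the last token, while only half of them agree on a boundary after token 2. Positions with partial agreement of this kind are where a threshold choice matters most, which is one reason a larger pool of experts gives a more stable picture.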
Original language | English |
---|---|
Title of host publication | Speech and Computer - 19th International Conference, SPECOM 2017, Proceedings |
Editors | Alexey Karpov, Iosif Mporas, Rodmonga Potapova |
Publisher | Springer Nature |
Pages | 456-463 |
Number of pages | 8 |
ISBN (Print) | 9783319664286 |
DOIs | |
State | Published - 1 Jan 2017 |
Event | 19th International Conference on Speech and Computer - Hatfield, United Kingdom; Duration: 11 Sep 2017 → 15 Sep 2017 |

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10458 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

Conference | 19th International Conference on Speech and Computer |
---|---|
Abbreviated title | SPECOM 2017 |
Country/Territory | United Kingdom |
City | Hatfield |
Period | 11/09/17 → 15/09/17 |
ID: 50412351