Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
In this paper, we describe the second stage of a study that uses machine learning algorithms to identify the factors that influence the phonetic reduction of words in Russian speech. We discuss the limitations of the first stage of our study and try to overcome some of them by enlarging the dataset and using new algorithms such as random forest, gradient boosting, and perceptron. We used texts from the Corpus of Russian Speech as the data. The dataset was divided into two separate datasets: one consisted of single words and the other contained multiword units from our corpus. According to the results, the most important features for single words turned out to be the number of syllables and whether the word is an adjective, as these were chosen by all algorithms. For the multiword units, the main features were the number of syllables, frequency in Russian spoken texts (in ipm), and token frequency in a given text. In our further research, we are going to expand the dataset and look more closely at such features as text type and token frequency in a given text.
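As an illustration of the three features named for multiword units, the following minimal Python sketch computes a syllable count, a frequency in instances per million (ipm), and a token frequency within one text. It is not the authors' feature extraction: the function names, the toy tokens, and the corpus counts are hypothetical, and counting vowel letters is only an orthographic approximation of the number of syllables in a Russian word.

```python
from collections import Counter

# Russian vowel letters; each one normally forms the nucleus of a syllable.
RUSSIAN_VOWELS = set("аеёиоуыэюя")


def count_syllables(word: str) -> int:
    """Approximate the syllable count as the number of vowel letters."""
    return sum(1 for ch in word.lower() if ch in RUSSIAN_VOWELS)


def ipm(count: int, corpus_size: int) -> float:
    """Convert a raw corpus count to instances per million tokens."""
    return count * 1_000_000 / corpus_size


def token_features(tokens, corpus_counts, corpus_size):
    """Build one feature dict per distinct token of a single text."""
    text_counts = Counter(t.lower() for t in tokens)
    return {
        tok: {
            "syllables": count_syllables(tok),
            "ipm": ipm(corpus_counts.get(tok, 0), corpus_size),
            "freq_in_text": n,
        }
        for tok, n in text_counts.items()
    }


# Toy input: the counts below are invented for illustration only.
tokens = ["сейчас", "я", "сейчас", "приду"]
corpus_counts = {"сейчас": 1500, "я": 30000, "приду": 40}
features = token_features(tokens, corpus_counts, corpus_size=1_000_000)
```

Feature dictionaries of this shape could then be passed to any classifier that reports feature importances; which implementation the study used is not stated in the abstract.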
Original language | English |
---|---|
Title of host publication | Speech and Computer - 23rd International Conference, SPECOM 2021, Proceedings |
Editors | Alexey Karpov, Rodmonga Potapova |
Publisher | Springer Nature |
Pages | 146-156 |
Number of pages | 11 |
ISBN (Print) | 9783030878016 |
State | Published - 2021 |
Event | 23rd International Conference on Speech and Computer, SPECOM 2021 - Virtual, Online, Russian Federation. Duration: 27 Sep 2021 → 30 Sep 2021. Conference number: 23. http://specom.nw.ru/2021/ |
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12997 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference | 23rd International Conference on Speech and Computer, SPECOM 2021 |
---|---|
Abbreviated title | SPECOM 2021 |
Country/Territory | Russian Federation |
City | Virtual, Online |
Period | 27/09/21 → 30/09/21 |
ID: 87566335