The article examines the perception and extraction of keyphrases in both written and spoken text. Experiments were performed on a dataset comprising transcripts and audio recordings of lectures given by Russian-speaking participants in the “Postnauka” project. The results show that automated keyphrase extraction methods have limited accuracy: statistical algorithms perform worst, while generative AI models such as ChatGPT come closer to human perception. Additionally, although keyphrases extracted from written and spoken texts overlap to some extent, spoken text shows greater variability. Experiments with synthesized speech indicate that listeners rely heavily on content rather than acoustic cues when understanding spoken text. Acoustic analysis reveals that keyphrases are distinguished by longer duration, wider pitch range, and higher energy, in line with previous findings for other languages.
Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, Proceedings, Part I
Pages: 265-280
Number of pages: 16
State: Published - 2025
Event: 26th International Conference on Speech and Computer: SPECOM 2024 - University of Novi Sad, Belgrade, Serbia
Duration: 25 Nov 2024 – 28 Nov 2024
Conference number: 26
https://specom.nw.ru/2024/
https://specom2024.ftn.uns.ac.rs

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15299 LNAI

Conference

Conference: 26th International Conference on Speech and Computer
Abbreviated title: SPECOM 2024
Country/Territory: Serbia
City: Belgrade
Period: 25/11/24 – 28/11/24

Research areas

  • Acoustic Analysis, Expert Annotation, Keyphrase Extraction, Perception, Russian Language

ID: 126874264