  • Megha Roshan
  • Mukul Rawat
  • Karan Aryan
  • Elena Lyakso
  • Mary Mekala A.
  • Nersissona Ruban

Recognizing a person's true emotion is considered essential for customer feedback and medical applications. Many methods recognize emotion from a speech signal by extracting frequency, pitch, and other dominant features, which are then used to train models that detect human emotions automatically. However, speech features alone cannot be fully relied on: for instance, an angry customer who still speaks in a low voice (in terms of frequency components) will lead to wrong predictions. Even a video-based emotion detection system can be fooled by false facial expressions. To rectify this issue, a parallel model is needed that is trained on textual data and makes predictions based on the words present in the text. Such a model classifies emotions using more comprehensive information, which makes it more robust. To this end, we tested four text-based classification models for classifying the emotions of a customer. Comparing their results showed that the modified encoder-decoder model with an attention mechanism, trained on textual data, achieved an accuracy of 93.5%. This research highlights the pressing need for more robust emotion recognition systems and underscores the potential of transfer models with attention mechanisms to significantly improve feedback management processes and medical applications.

Original language: English
Article number: e0301336
Number of pages: 21
Journal: PLoS ONE
Volume: 19
Issue number: 4
State: Published - 16 Apr 2024

    Research areas

  • Male, Humans, Emotions, Speech, Voice, Linguistics, Recognition, Psychology

    Scopus subject areas

  • Computer Science (all)

ID: 119154629