Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic

Standard

Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. / Lyakso, Elena; Frolova, Olga; Matveev, Anton; Matveev, Yuri; Grigorev, Aleksey; Makhnytkina, Olesya; Ruban, Nersisson.

Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings. ed. / S.R. Mahadeva Prasanna; Alexey Karpov; K. Samudravijaya; Shyam S. Agrawal. Springer Nature, 2022. p. 438-450 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13721 LNAI).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Lyakso, E, Frolova, O, Matveev, A, Matveev, Y, Grigorev, A, Makhnytkina, O & Ruban, N 2022, Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. in SRM Prasanna, A Karpov, K Samudravijaya & SS Agrawal (eds), Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13721 LNAI, Springer Nature, pp. 438-450, 24th International Conference on Speech and Computer SPECOM 2022, Gurugram, India, 14/11/22. https://doi.org/10.1007/978-3-031-20980-2_38

APA

Lyakso, E., Frolova, O., Matveev, A., Matveev, Y., Grigorev, A., Makhnytkina, O., & Ruban, N. (2022). Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. In S. R. M. Prasanna, A. Karpov, K. Samudravijaya, & S. S. Agrawal (Eds.), Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings (pp. 438-450). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13721 LNAI). Springer Nature. https://doi.org/10.1007/978-3-031-20980-2_38

Vancouver

Lyakso E, Frolova O, Matveev A, Matveev Y, Grigorev A, Makhnytkina O et al. Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. In Prasanna SRM, Karpov A, Samudravijaya K, Agrawal SS, editors, Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings. Springer Nature. 2022. p. 438-450. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-031-20980-2_38

Author

Lyakso, Elena ; Frolova, Olga ; Matveev, Anton ; Matveev, Yuri ; Grigorev, Aleksey ; Makhnytkina, Olesya ; Ruban, Nersisson. / Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings. editor / S.R. Mahadeva Prasanna ; Alexey Karpov ; K. Samudravijaya ; Shyam S. Agrawal. Springer Nature, 2022. pp. 438-450 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

BibTeX

@inproceedings{becc570a962642de82296a0e4f58988f,

title = "Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic",

abstract = "The paper presents the results of perceptual experiments (by humans) and automatic recognition of the emotional states of children with Down syndrome (DS) by video, audio and text modalities. The participants of the study were 35 children with DS aged 5–16 years, and 30 adults – the participants of the perceptual experiment. Automatic analysis of facial expression by video was performed using FaceReader software running on the Microsoft Azure cloud platform and a convolutional neural network. Automatic recognition of the emotional states of children by speech was carried out using a recurrent neural network. Specifically for this project, we did not apply any additional transfer learning or fine-tuning as our goal was to investigate how the generic models perform for children with DS. The results of perceptual experiments showed that adults recognize the emotional states of children with DS by video better than by audio. Automatic classification of children{\textquoteright}s emotional states by facial expression revealed better results for joy and neutral states than for sadness and anger; by audio, the best results were shown for the neutral state; by the texts of children{\textquoteright}s speech, for joy. The state of sadness was not recognized automatically. The study revealed the possibility of using the available software for classifying the neutral state and the state of joy, i.e. states with neutral and positive valence, and the need to develop an approach to determine the state of sadness and anger.",

keywords = "Emotional state, Perceptual and automatic recognition, Child with Down syndrome, Video, Audio, Text modalities",

author = "Elena Lyakso and Olga Frolova and Anton Matveev and Yuri Matveev and Aleksey Grigorev and Olesya Makhnytkina and Nersisson Ruban",

note = "Publisher Copyright: {\textcopyright} 2022, Springer Nature Switzerland AG.; Conference date: 14-11-2022 Through 16-11-2022",

year = "2022",

month = nov,

doi = "10.1007/978-3-031-20980-2_38",

language = "English",

isbn = "978-3-031-20979-6",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Nature",

pages = "438--450",

editor = "Prasanna, {S.R. Mahadeva} and Alexey Karpov and K. Samudravijaya and Agrawal, {Shyam S.}",

booktitle = "Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings",

address = "Germany",

url = "https://www.specom.co.in",

}

RIS

TY - GEN

T1 - Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic

AU - Lyakso, Elena

AU - Frolova, Olga

AU - Matveev, Anton

AU - Matveev, Yuri

AU - Grigorev, Aleksey

AU - Makhnytkina, Olesya

AU - Ruban, Nersisson

PY - 2022/11

Y1 - 2022/11

N2 - The paper presents the results of perceptual experiments (by humans) and automatic recognition of the emotional states of children with Down syndrome (DS) by video, audio and text modalities. The participants of the study were 35 children with DS aged 5–16 years, and 30 adults – the participants of the perceptual experiment. Automatic analysis of facial expression by video was performed using FaceReader software running on the Microsoft Azure cloud platform and a convolutional neural network. Automatic recognition of the emotional states of children by speech was carried out using a recurrent neural network. Specifically for this project, we did not apply any additional transfer learning or fine-tuning as our goal was to investigate how the generic models perform for children with DS. The results of perceptual experiments showed that adults recognize the emotional states of children with DS by video better than by audio. Automatic classification of children’s emotional states by facial expression revealed better results for joy and neutral states than for sadness and anger; by audio, the best results were shown for the neutral state; by the texts of children’s speech, for joy. The state of sadness was not recognized automatically. The study revealed the possibility of using the available software for classifying the neutral state and the state of joy, i.e. states with neutral and positive valence, and the need to develop an approach to determine the state of sadness and anger.

AB - The paper presents the results of perceptual experiments (by humans) and automatic recognition of the emotional states of children with Down syndrome (DS) by video, audio and text modalities. The participants of the study were 35 children with DS aged 5–16 years, and 30 adults – the participants of the perceptual experiment. Automatic analysis of facial expression by video was performed using FaceReader software running on the Microsoft Azure cloud platform and a convolutional neural network. Automatic recognition of the emotional states of children by speech was carried out using a recurrent neural network. Specifically for this project, we did not apply any additional transfer learning or fine-tuning as our goal was to investigate how the generic models perform for children with DS. The results of perceptual experiments showed that adults recognize the emotional states of children with DS by video better than by audio. Automatic classification of children’s emotional states by facial expression revealed better results for joy and neutral states than for sadness and anger; by audio, the best results were shown for the neutral state; by the texts of children’s speech, for joy. The state of sadness was not recognized automatically. The study revealed the possibility of using the available software for classifying the neutral state and the state of joy, i.e. states with neutral and positive valence, and the need to develop an approach to determine the state of sadness and anger.

KW - Perceptual and automatic recognition

KW - Child with Down syndrome

KW - Video

KW - Audio

KW - Text modalities

KW - Emotional state

UR - https://www.mendeley.com/catalogue/47a4af70-3259-313c-9259-3a52f00a1ed6/

UR - http://www.scopus.com/inward/record.url?scp=85142767380&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-20980-2_38

DO - 10.1007/978-3-031-20980-2_38

M3 - Conference contribution

SN - 978-3-031-20979-6

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 438

EP - 450

BT - Speech and Computer - 24th International Conference, SPECOM 2022, Proceedings

A2 - Prasanna, S.R. Mahadeva

A2 - Karpov, Alexey

A2 - Samudravijaya, K.

A2 - Agrawal, Shyam S.

PB - Springer Nature

Y2 - 14 November 2022 through 16 November 2022

ER -

ID: 100304343