To study how emotional state is reflected in the voice and speech of 6–12-year-old children with autism spectrum disorders (ASD), Down syndrome (DS), and typical development (TD), automatic classification of children’s emotional speech into the states “comfort – neutral – discomfort” was conducted. Child speech was recorded in model situations: a dialogue with the experimenter and play with a standard set of toys; the recordings were annotated by three experts. Automatic classification of children’s speech into the three states was performed using
automatically extracted acoustic feature sets: the minimal GeMAPS set and the extended eGeMAPS set. As classifiers, we used Gaussian Mixture Models (GMM) and Support Vector Machines (SVM). The state of discomfort is classified best for children with ASD (precision 0.523; recall 0.305) and DS (0.504; 0.564); the state of comfort, for TD children (0.546; 0.241). The minimal GeMAPS feature set gives better results (accuracy 0.687, 0.725, and 0.641 for ASD, DS, and TD children, respectively) than the extended eGeMAPS feature set
(0.671, 0.717, 0.631), which indicates the importance of low-level features. The results were compared with the data of an auditory perceptual experiment (100 listeners). Listeners recognized discomfort better in children with ASD and DS (78%) and the comfort state better in TD children (58%).
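The classification pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: in practice the 62 GeMAPS (or 88 eGeMAPS) functionals would be extracted from the speech recordings with a toolkit such as openSMILE, whereas here a synthetic feature matrix stands in for them, and all dataset sizes are hypothetical.

```python
# Sketch of three-state ("comfort / neutral / discomfort") classification
# with the two classifier families named in the text: SVM and per-class GMM.
# Synthetic features replace the real GeMAPS functionals (62 per utterance).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score

rng = np.random.default_rng(0)
N_PER_CLASS, N_FEATURES = 100, 62          # 62 ~ GeMAPS functional count
STATES = ["comfort", "neutral", "discomfort"]

# Synthetic feature matrix: each state's mean is shifted to be separable.
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(N_PER_CLASS, N_FEATURES))
               for i in range(len(STATES))])
y = np.repeat(np.arange(len(STATES)), N_PER_CLASS)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# SVM classifier (RBF kernel) on standardized features.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_tr, y_tr)
svm_pred = svm.predict(X_te)

# GMM classifier: fit one mixture per state, label by maximum log-likelihood.
gmms = [GaussianMixture(n_components=2, covariance_type="diag",
                        random_state=0).fit(X_tr[y_tr == k])
        for k in range(len(STATES))]
gmm_pred = np.argmax([g.score_samples(X_te) for g in gmms], axis=0)

# Per-class precision/recall, as reported in the study, plus overall accuracy.
for name, pred in [("SVM", svm_pred), ("GMM", gmm_pred)]:
    print(name,
          "accuracy:", round(accuracy_score(y_te, pred), 3),
          "precision:", np.round(precision_score(y_te, pred, average=None), 3),
          "recall:", np.round(recall_score(y_te, pred, average=None), 3))
```

Per-class precision and recall (rather than accuracy alone) matter here because, as the results show, which state is recognized best differs across the child groups.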