The paper analyzes acoustic models for speech recognition trained on limited data. Two types of training material were used: phonetically balanced texts and arbitrarily chosen fragments of read speech, with each recording about 3 minutes long. The test material consisted of recordings of both spontaneous and read speech. The main factor influencing model performance was found to be the number of sounds in the training material.
