This paper addresses automatic language identification using deep neural networks. The task is formulated as a classification problem in which the input is an audio recording of speech and the output is the detected language. A neural network algorithm is proposed for this problem and evaluated experimentally on the Common Voice dataset. The results show that deep learning architectures can perform relatively well on this task. Of the architectures considered, the convolutional neural network demonstrated the best ability to distinguish language-specific features and was the most robust to overfitting.
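The classification formulation can be illustrated with a minimal sketch. It is not the network proposed in the paper: it assumes a fixed-length feature vector has already been extracted from the audio clip, and it stands in for only the final decision step (linear scores, softmax, and arg max over languages). The label set `LANGUAGES`, the function names, and the weights are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical label set; the languages used in the paper are not assumed here.
LANGUAGES = ["en", "de", "ru"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> Tuple[str, List[float]]:
    """Map an audio feature vector to a language label.

    `weights` holds one row per language; in a real system these would be
    learned, and the features would come from earlier network layers.
    """
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LANGUAGES[best], probs


# Toy usage: illustrative weights, not trained values.
label, probs = classify(
    features=[1.0, 0.0],
    weights=[[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
    biases=[0.0, 0.0, 0.0],
)
```

In this toy call the second row produces the largest score, so the predicted label is the second entry of `LANGUAGES`. A deep network replaces the hand-written features and weights with learned ones, but the output stage has the same shape: one probability per candidate language.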