AUTOMATIC KAZAKH SPEECH RECOGNITION WITH DNN
Abstract
This paper addresses automatic speech recognition, one area of artificial intelligence. By comparing Kazakh speech with that of other languages, the authors identify the main problems in automatic recognition of Kazakh. One of the main problems is the lack of speech data, so acoustic data for the Kazakh language were collected. To support further research on Kazakh, the speakers' personal data were also recorded. Algorithms for speech signal processing and for training acoustic and language models are described, and both research and practical work are carried out. Test results for speech recognition using deep neural networks (DNNs) were obtained and compared with the results of traditional models, highlighting the strengths of the DNN approach.
About the Authors
O. Mamyrbayev
Kazakhstan
M. Turdalyuly
Kazakhstan
N. Mekebayev
Kazakhstan
T. Turdalykyzy
Kazakhstan
A. Shayakhmetova
Kazakhstan
For citations:
Mamyrbayev O., Turdalyuly M., Mekebayev N., Turdalykyzy T., Shayakhmetova A. AUTOMATIC KAZAKH SPEECH RECOGNITION WITH DNN. Herald of the Kazakh-British Technical University. 2019;16(2):134-142. (In Russ.)