Herald of the Kazakh-British Technical University

METHODS OF PROCESSING AND ANALYZING BIG DATA IN MACHINE LEARNING TASKS: APPROACHES AND PROSPECTS

https://doi.org/10.55452/1998-6688-2025-22-1-25-35

Abstract

This article examines methods for processing and analyzing big data with the aim of improving the accuracy and efficiency of machine learning (ML) models. It focuses on classification problems and on the effectiveness of algorithms such as XGBoost, the support vector machine (SVM), and ensemble methods, as well as big data systems including Hadoop and Apache Spark. The key stages of data preparation are described: cleaning, normalization, and feature selection, which are critically important for building stable models on large volumes of data. Accuracy, recall, F-measure, and AUC-ROC were used to evaluate the algorithms, which made it possible to carry out a comparative analysis and identify the best-performing approaches. Special attention is paid to the application of ML in the context of organizational innovation, including classification, forecasting the success of innovations, and innovation portfolio management. Recommendations are given on the choice of technologies and algorithms for various data types and scales, and the prospects for integrating distributed computing platforms with ML algorithms to achieve scalable and efficient solutions are discussed.
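As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, F-measure, and AUC-ROC for a binary classifier in plain Python. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and AUC-ROC is computed by direct pairwise comparison, which is only practical for small samples.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-measure (F1) from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive example is scored
    above a random negative one (ties count as 0.5). O(n_pos * n_neg)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and model scores for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
metrics["auc_roc"] = auc_roc(y_true, scores)
print(metrics)
```

On data at the scale the article discusses, the same quantities would normally come from a library or a distributed framework rather than hand-written loops; the sketch only makes the definitions concrete.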

About the Authors

N. Komarov
International University of Information Technologies
Kazakhstan
Master’s student
Almaty

S. B. Mukhanov
International University of Information Technologies
Kazakhstan
PhD, Assistant Professor
Almaty

I. M. Bazarbekov
International University of Information Technologies
Kazakhstan
Master, Senior Lecturer
Almaty

S. Zh. Zhakypbekov
International University of Information Technologies
Kazakhstan
Master, Senior Lecturer
Almaty

S. Y. Sibanbayeva
Almaty Management University
Kazakhstan
PhD, Assistant Professor
Almaty

For citations:


Komarov N., Mukhanov S.B., Bazarbekov I.M., Zhakypbekov S.Zh., Sibanbayeva S.Y. METHODS OF PROCESSING AND ANALYZING BIG DATA IN MACHINE LEARNING TASKS: APPROACHES AND PROSPECTS. Herald of the Kazakh-British Technical University. 2025;22(1):25-35. https://doi.org/10.55452/1998-6688-2025-22-1-25-35


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1998-6688 (Print)
ISSN 2959-8109 (Online)