OPTIMIZATION OF DATA PREPARATION ALGORITHMS FOR TRAINING AND APPLICATION OF MACHINE LEARNING MODELS IN PYTHON
Abstract
This paper examines how data preparation algorithms can improve model training in Python 3. We describe how to handle missing values in a data set and how to eliminate them depending on various factors. We consider algorithms for converting nominal variables into a form suitable for training models from the scikit-learn library. Finally, using a binary classification model as an example, we apply a method of combining data conversion algorithms to achieve the highest predictive ability as measured by the F1 score.
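The three steps named in the abstract (filling missing values, encoding nominal variables, and scoring with F1) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of mean and mode imputation, and the toy data are assumptions made for illustration, and only the Python standard library is used to keep the example dependency-free. In scikit-learn, the analogous tools are `SimpleImputer`, `OneHotEncoder`, and `sklearn.metrics.f1_score`.

```python
from collections import Counter
from statistics import mean


def impute_mean(values):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_mode(values):
    """Replace None in a nominal column with the most frequent category."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode a nominal column as 0/1 indicator rows, one column per category.

    Categories are sorted so the column order is deterministic.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def f1_score(y_true, y_pred):
    """F1 for binary labels (positive class = 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # convention used here when there are no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy columns with missing entries marked as None.
ages = impute_mean([20, None, 40, 30])
colors = impute_mode(["red", None, "blue", "red"])
categories, encoded = one_hot(colors)

print(ages)
print(categories, encoded)
print(f1_score([1, 0, 1, 1], [1, 1, 0, 1]))
```

Which imputation strategy and encoding to combine is the question the paper studies. In this sketch the choice is fixed by hand, whereas a comparison would repeat the preparation with each candidate combination and keep the one with the highest F1 on held-out data.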
About the Authors
M. Jilikbaev
Kazakhstan
A. Akzhalova
Kazakhstan
For citations:
Jilikbaev M., Akzhalova A. OPTIMIZATION OF DATA PREPARATION ALGORITHMS FOR TRAINING AND APPLICATION OF MACHINE LEARNING MODELS IN PYTHON. Herald of the Kazakh-British technical university. 2020;17(3):131-136. (In Russ.)