TIME SERIES-BASED APPROACHES FOR IMPROVING WIND POWER GENERATION FORECAST ACCURACY
https://doi.org/10.55452/1998-6688-2023-20-2-103-114
Abstract
This study analyzes and forecasts power generation at wind farms in Germany using three machine learning models: Lasso, LightGBM, and CatBoost. Feature engineering was applied to the data to extract more detailed information, which was then used to improve model quality. Through exploratory data analysis (EDA), the authors identify and construct lagged and moving-window features from the energy production time series, on the premise that accurate predictions can significantly improve the stability of energy systems, especially as dependence on renewable energy sources grows. The performance of each model is evaluated using the Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) metrics, with CatBoost achieving the highest accuracy. In conclusion, the study points to opportunities for further research aimed at optimizing these models and adapting them to other regions, and emphasizes the long-term potential of this work for the energy sector.
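The lagged and moving features and the three error metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample generation values are hypothetical, only the Python standard library is used, and the Lasso, LightGBM, and CatBoost models themselves are not reproduced here. A naive lag-1 forecast stands in for a model so that the metrics have something to score.

```python
import math


def lag_feature(series, lag):
    """Value observed `lag` steps earlier; None where no history exists."""
    return [series[i - lag] if i >= lag else None for i in range(len(series))]


def rolling_mean(series, window):
    """Mean of the previous `window` values.

    The current value is excluded so the feature never leaks the target.
    """
    return [
        sum(series[i - window:i]) / window if i >= window else None
        for i in range(len(series))
    ]


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error (square root of the MSE)."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical generation values, for illustration only (not from the paper).
power = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]

lag1 = lag_feature(power, 1)    # previous observation
roll3 = rolling_mean(power, 3)  # mean of the three previous observations

# Naive baseline (an assumption of this sketch): forecast each value by its lag-1 value.
y_true = power[1:]
y_pred = lag1[1:]

print("MAE :", mae(y_true, y_pred))
print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Because MSE and RMSE square the errors, the single large miss in this toy series weighs more heavily on them than on the MAE, which is one reason for reporting all three metrics side by side.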
About the Authors
Ye. N. Knaytov
Kazakhstan
Knaytov Yernar Nurlanuly, Master student
59, Tole bi street, Almaty, 050000
A. Zh. Akzhalova
Kazakhstan
Akzhalova Assel Zholdasovna, Dr., Head of International project groups, Coordinator of SDG center, Professor of IT Faculty, PhD in Math. Modelling (RK), PhD in Computer Science (King's College London, UK)
59, Tole bi street, Almaty, 050000
Ben Yahia Sadok
Estonia
Sadok Ben Yahia, Professor
Tallinn
Review
For citations:
Knaytov Ye.N., Akzhalova A.Zh., Sadok B.Ya. TIME SERIES-BASED APPROACHES FOR IMPROVING WIND POWER GENERATION FORECAST ACCURACY. Herald of the Kazakh-British Technical University. 2023;20(2):103-114. https://doi.org/10.55452/1998-6688-2023-20-2-103-114