A hybrid Machine Learning approach for air quality prediction in Morocco: combining CatBoost with metaheuristic optimization algorithms

  • Rachid ED-DAOUDI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco
  • Sokaina EL KHAMLICHI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco; Research Team in Science and Technology, Higher School of Technology of Laayoune, Ibn Zohr University, P.O. Box 3007, Laayoune, Morocco
  • Badia ETTAKI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco
Keywords: Air Pollution, Hybrid Machine Learning, CatBoost Algorithm, Metaheuristic Optimization, Air Quality Index

Abstract

Air pollution poses serious risks to public health and environmental sustainability, particularly in rapidly urbanizing areas of developing countries. This study investigates whether combining machine learning algorithms with metaheuristic optimization techniques can improve the accuracy and efficiency of air quality prediction in Morocco. The main objective is to compare direct classification of Air Quality Index (AQI) categories with a regression-based approach, and to evaluate the effectiveness of two optimization strategies, the Arithmetic Optimization Algorithm (AOA) and Hunger Games Search (HGS), in tuning the hyperparameters of the CatBoost model. Using five months of air quality data from two monitoring stations in Ait Melloul, we modeled concentrations of PM2.5, PM10, and CO, and derived the corresponding AQI classifications. With the hybrid approach, regression-based classification improved accuracy by nearly 30 percentage points over direct classification. Moreover, HGS achieved predictive performance similar to that of AOA but was more than twice as computationally efficient. CO concentration predictions in residential areas achieved high accuracy (R² > 0.95), while particulate matter predictions revealed limitations in capturing extreme pollution events. These findings suggest that combining gradient boosting with metaheuristic optimization is a promising strategy for developing scalable and accurate air quality forecasting systems in North African urban environments, with important implications for public health protection and environmental policy implementation.
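The regression-based route described in the abstract (predict a pollutant concentration first, then derive the AQI category from it) can be illustrated with a minimal Python sketch. The abstract does not state the breakpoints or category labels used in the study, so the thresholds, labels, and function name below are hypothetical placeholders, and the "predictions" are made-up numbers standing in for the output of a fitted regressor.

```python
from bisect import bisect_left

# Hypothetical inclusive upper bounds (µg/m³) for each category except the last.
# These are placeholders for illustration, not the breakpoints used in the study.
PM25_UPPER_BOUNDS = [12.0, 35.4, 55.4, 150.4]
CATEGORY_LABELS = [
    "Good",
    "Moderate",
    "Unhealthy for sensitive groups",
    "Unhealthy",
    "Very unhealthy",
]


def concentration_to_category(value, upper_bounds=PM25_UPPER_BOUNDS,
                              labels=CATEGORY_LABELS):
    """Map a predicted concentration to a category label.

    bisect_left returns the index of the first bound >= value, so each
    upper bound is treated as inclusive for its category.
    """
    if value < 0:
        raise ValueError("concentration must be non-negative")
    return labels[bisect_left(upper_bounds, value)]


# Made-up regression outputs standing in for model predictions.
predicted_pm25 = [8.0, 12.0, 40.0, 200.0]
predicted_categories = [concentration_to_category(v) for v in predicted_pm25]
print(predicted_categories)
```

In this setup the classification accuracy depends on how close the predicted concentration is to the true one near each threshold, rather than on a classifier learning the category boundaries directly.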
Published
2025-08-20
How to Cite
ED-DAOUDI, R., EL KHAMLICHI, S., & ETTAKI, B. (2025). A hybrid Machine Learning approach for air quality prediction in Morocco: combining CatBoost with metaheuristic optimization algorithms. Statistics, Optimization & Information Computing, 14(5), 2445-2471. https://doi.org/10.19139/soic-2310-5070-2705
Section
Research Articles