Comparison of Machine Learning Algorithms for Time Series Forecasting of CO in the City of Cartagena

Abstract

Air pollution is an increasingly urgent global concern, with significant implications for public health and environmental sustainability. This paper investigates the prediction of carbon monoxide (CO) concentrations through time series analysis, using data gathered by urban sensors in Cartagena as a case study. A comprehensive set of machine learning and statistical approaches is evaluated, using the sktime Python library for modeling and Optuna for hyperparameter optimization. We assess classical time series models (ARIMA, ETS, etc.), regression-based approaches (k-nearest neighbors), and deep learning architectures (CNNRegressor), and we explore how the training window size (ranging from one week to several months) affects forecasting accuracy and runtime. Forecast accuracy is compared using multiple metrics, including SMAPE, RMSE, MAE, and R², and execution times are also reported. Results show that certain relatively simple models, such as ETS or ARIMA, can achieve robust performance across various sensors, while a k-NN reduction approach offers an appealing trade-off between speed and accuracy. These findings emphasize the potential of adequately tuned algorithms for short-term CO forecasting in urban environments, supporting proactive air quality management.
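As an illustration of the four accuracy metrics named above, the following is a minimal sketch in plain Python. It is not the paper's evaluation code: the function names are hypothetical, and the SMAPE variant shown (mean of 2|y − ŷ| / (|y| + |ŷ|), reported as a fraction rather than a percentage) is one common definition that is assumed here, since the abstract does not state which form was used.

```python
import math


def smape(actual, forecast):
    """Symmetric MAPE as a fraction in [0, 2] (assumed definition)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Treat a 0/0 term as zero error to avoid division by zero.
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - f) / denom)
    return sum(terms) / len(terms)


def rmse(actual, forecast):
    """Root mean squared error."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def r2(actual, forecast):
    """Coefficient of determination; undefined if `actual` is constant."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def evaluate(actual, forecast):
    """Collect all four metrics for one forecast in a dictionary."""
    return {
        "SMAPE": smape(actual, forecast),
        "RMSE": rmse(actual, forecast),
        "MAE": mae(actual, forecast),
        "R2": r2(actual, forecast),
    }


if __name__ == "__main__":
    # Made-up values for demonstration only, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print(evaluate(observed, predicted))
```

Reporting scale-dependent errors (RMSE, MAE) alongside scale-free ones (SMAPE, R²) is what allows comparison both within a single sensor and across sensors with different concentration ranges.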

Publication
Intelligent Systems and Applications, pp. 366–382