Air Quality Prediction - ML/DL Comparison
Master's thesis comparing machine learning and deep learning techniques for air quality forecasting using SARIMAX, DeepAR, LSTM, and Neural Prophet.
Overview
Thesis project for the Master's degree in Artificial Intelligence at Universidad Internacional de La Rioja (UNIR). A comparative study of machine learning and deep learning models for predicting ozone concentration, using the London Average Air Quality Levels (LAAQL) dataset from King's College London.
Key Features
- Comparative analysis of 4 forecasting algorithms: SARIMAX, DeepAR, LSTM, and Neural Prophet
- Time series prediction of ground-level ozone (O3) concentration
- Performance evaluation using MSE, MAE, RMSE, and MAPE metrics
- Data preprocessing and feature engineering for air quality data
- Visualization of prediction results and model comparison
- Statistical analysis of model performance across different time horizons
Challenges & Solutions
Model Selection and Comparison
Challenge: Needed to identify the most suitable algorithms for time series forecasting of air quality data and establish fair comparison criteria.
Solution: Selected 4 approaches from different model families: SARIMAX (statistical), DeepAR (probabilistic), and LSTM and Neural Prophet (deep learning). All four were evaluated on the same dataset with the same regression metrics (MSE, MAE, RMSE, MAPE).
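A minimal sketch of how such a like-for-like comparison could be scored, assuming every model produces a forecast for the same held-out observations. The function names `regression_metrics` and `compare_models`, and the `eps` guard in the MAPE term, are illustrative assumptions rather than the thesis code.

```python
import numpy as np


def regression_metrics(y_true, y_pred, eps=1e-8):
    """Return MSE, MAE, RMSE, and MAPE (in percent) for one forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(mse))
    # MAPE is undefined when an observation is exactly zero;
    # eps (an assumption of this sketch) guards the division.
    mape = float(np.mean(np.abs(err) / np.maximum(np.abs(y_true), eps)) * 100.0)

    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "MAPE": mape}


def compare_models(y_true, forecasts):
    """Score each model's forecast against the same observations.

    `forecasts` maps a model name to an array with the same length as y_true.
    """
    return {name: regression_metrics(y_true, pred) for name, pred in forecasts.items()}
```

Calling `compare_models(y_test, {"SARIMAX": pred_a, "LSTM": pred_b})` with hypothetical forecast arrays would return one metrics dictionary per model, so the four models can be ranked on identical data.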
Time Series Data Preprocessing
Challenge: The air quality data contained missing values and outliers, and required temporal feature engineering for accurate predictions.
Solution: Implemented a data cleaning pipeline with interpolation of missing values, outlier detection, and temporal feature extraction. Created train/validation/test splits that respect temporal order.
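The steps above could look roughly like the following sketch, assuming an hourly ozone series with a `DatetimeIndex`. The IQR outlier rule, the calendar features, the column name `o3`, and the 70/15/15 split fractions are all illustrative assumptions, not details taken from the thesis.

```python
import pandas as pd


def preprocess(series, iqr_factor=1.5):
    """Clean a time-indexed series and add calendar features.

    Outliers are flagged with an IQR rule (an assumption of this sketch),
    treated as missing, and then filled together with the original gaps.
    """
    s = series.sort_index().astype(float)

    # Flag values far outside the interquartile range and blank them out.
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (s < q1 - iqr_factor * iqr) | (s > q3 + iqr_factor * iqr)
    s = s.mask(is_outlier)

    # Time-weighted linear interpolation; ffill/bfill cover gaps at the edges.
    s = s.interpolate(method="time").ffill().bfill()

    df = pd.DataFrame({"o3": s})
    df["hour"] = df.index.hour
    df["dayofweek"] = df.index.dayofweek
    df["month"] = df.index.month
    return df


def chronological_split(df, train_frac=0.7, val_frac=0.15):
    """Split into train/validation/test without shuffling, oldest rows first."""
    n = len(df)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return df.iloc[:train_end], df.iloc[train_end:val_end], df.iloc[val_end:]
```

Because the split slices by position on a sorted index, every validation and test timestamp is later than every training timestamp, which avoids leaking future observations into training.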
Project Information
Timeline
Sep 2022 - Mar 2023
Role
Researcher / Developer
Project Metrics
Best RMSE
3.29%
Best MAPE
9.55%
Models Compared
4