This repository is prepared for the INT3209E_1 Data Mining course seminar.
Libraries used:
- pandas: data manipulation
- numpy: numerical computations
- matplotlib / seaborn: visualization
- statsmodels: AR, MA, ARMA, ARIMA modeling
- scikit-learn: performance metrics, preprocessing, PCA
- pmdarima: auto ARIMA modeling
This repository contains basic time series forecasting projects using PCA dimensionality reduction combined with ARIMA models in Python.
The goal is to learn how to analyze, model, and forecast time series data in practice, with clean, reproducible code.
We cover both types of time series data:
- Univariate Time Series: Only one variable is tracked over time.
  Example: Monthly AirPassengers dataset.
- Multivariate Time Series: Multiple variables are tracked over time, which may influence each other.
  Example: Daily Climate dataset.
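
The difference between the two data types can be sketched with pandas. This is only an illustration on synthetic values: the column names (`passengers`, `temperature`, `humidity`, `wind_speed`) and the date ranges are assumptions and are not taken from the AirPassengers or Daily Climate files.

```python
import numpy as np
import pandas as pd

# Synthetic values only; they stand in for the real datasets.
rng = np.random.default_rng(0)

# Univariate: one value per timestamp, naturally a pandas Series.
monthly_index = pd.date_range("2020-01-01", periods=24, freq="MS")
univariate = pd.Series(
    rng.normal(loc=100, scale=10, size=24),
    index=monthly_index,
    name="passengers",  # hypothetical column name
)

# Multivariate: several values per timestamp, naturally a DataFrame.
daily_index = pd.date_range("2020-01-01", periods=30, freq="D")
multivariate = pd.DataFrame(
    {
        "temperature": rng.normal(25, 3, size=30),  # hypothetical columns
        "humidity": rng.normal(60, 8, size=30),
        "wind_speed": rng.normal(7, 2, size=30),
    },
    index=daily_index,
)

print(univariate.shape)    # (24,)
print(multivariate.shape)  # (30, 3)
```

A univariate series can be passed to an ARIMA model directly, while a multivariate frame has to be reduced or modeled jointly first, which is where PCA comes in.
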
- Load and preprocess time series datasets (AirPassengers, Bitcoin, etc.)
- Visualize trends, seasonality, and residuals
- StandardScaler for data normalization
- PCA (Principal Component Analysis) for dimensionality reduction
- Latent variable extraction and analysis
- Build models:
  - AR (Autoregressive)
  - MA (Moving Average)
  - ARMA (Autoregressive Moving Average)
  - ARIMA (Autoregressive Integrated Moving Average)
- Evaluate model performance:
  - MSE (Mean Squared Error)
  - MAE (Mean Absolute Error)
  - R² Score
- Plot forecasts and confidence intervals