Clustering - Countries by Socio-economic and Health Indicators (Unsupervised Learning)

Overview

This project applies unsupervised clustering techniques to categorize countries based on socio-economic and health indicators, facilitating analysis of their overall development levels.

Project Goals

Classify countries into meaningful clusters using unsupervised learning.
Visualize and analyze patterns across socio-economic and health-related data.
Use dimensionality reduction (PCA) and K-Means clustering to produce well-separated, interpretable groups.

Steps Included

1. Exploratory Data Analysis (EDA)

Initial data exploration
Data cleaning and preprocessing
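
A minimal sketch of the kind of checks this step usually involves. The toy table, its column names (`indicator_1`, `indicator_2`) and the median-fill policy are assumptions for illustration, not taken from the notebook or from country-data.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table; the column names are placeholders,
# not the columns of the real dataset.
df = pd.DataFrame({
    "country": ["A", "B", "C", "C"],
    "indicator_1": [10.0, np.nan, 30.0, 30.0],
    "indicator_2": [1.5, 2.5, 3.5, 3.5],
})

# Initial exploration: missing values per column and fully duplicated rows.
missing_per_column = df.isna().sum()
n_duplicates = int(df.duplicated().sum())

# One simple cleaning policy (an assumption): drop exact duplicates,
# then fill numeric gaps with the column median.
clean = df.drop_duplicates().copy()
numeric_cols = clean.select_dtypes(include="number").columns
clean[numeric_cols] = clean[numeric_cols].fillna(clean[numeric_cols].median())

print(missing_per_column)
print(clean.describe())
```

Whether to fill, drop, or keep missing values depends on the data; the median fill here is only one option.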

2. Data Visualization

Geographical Maps
Heatmaps
Histograms for understanding feature distributions
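
As a rough illustration of the numbers behind a heatmap and a histogram, the sketch below computes a correlation matrix and histogram bin counts on synthetic data. The indicator names and the data are made up for the example; plotting calls are left out so the block runs without a display.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
base = rng.normal(size=200)

# Synthetic stand-in for the indicator table (hypothetical column names).
df = pd.DataFrame({
    "indicator_a": base,
    "indicator_b": 2.0 * base + rng.normal(scale=0.1, size=200),  # tied to indicator_a
    "indicator_c": rng.normal(size=200),                           # independent noise
})

# Pairwise Pearson correlations: the matrix a correlation heatmap displays.
corr = df.corr()

# Bin counts and edges: the values a histogram of one feature displays.
counts, bin_edges = np.histogram(df["indicator_a"], bins=10)

print(corr.round(2))
print(counts)
```

Passing `corr` to `seaborn.heatmap` (for example with `annot=True`) would render the matrix as a heatmap.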

3. Data Scaling

Normalization so that every feature contributes comparably to distance calculations, regardless of its original units
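
The README does not say which scaler the notebook uses, so the sketch below shows two common scikit-learn options on a made-up matrix: `StandardScaler` (zero mean, unit variance per column) and `MinMaxScaler` (each column mapped to the range 0 to 1).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Made-up data: the second column is on a far larger scale than the first,
# so it would dominate Euclidean distances if left unscaled.
X = np.array([
    [1.0, 1000.0],
    [2.0, 2000.0],
    [3.0, 6000.0],
])

# Standardization: subtract the column mean, divide by the column std.
X_std = StandardScaler().fit_transform(X)

# Min-max normalization: rescale each column to [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

print(X_std)
print(X_minmax)
```

Either choice puts the columns on comparable scales before PCA and K-Means, both of which are sensitive to feature magnitude.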

4. Feature Engineering

Principal Component Analysis (PCA) for dimensionality reduction and feature extraction
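
A minimal PCA sketch on synthetic data, assuming the features are standardized first. The 95% variance threshold is an illustrative choice, not a value taken from the notebook.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic data: 6 observed features driven by 2 hidden factors plus small noise.
latent = rng.normal(size=(150, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + rng.normal(scale=0.05, size=(150, 6))

# PCA is scale-sensitive, so standardize before fitting.
X_scaled = StandardScaler().fit_transform(X)

# A float n_components keeps the fewest components whose cumulative
# explained variance reaches that fraction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)
print(pca.explained_variance_ratio_)
```

Inspecting `pca.explained_variance_ratio_` shows how much of the total variance each retained component carries.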

5. Clustering Analysis

Elbow Method for choosing the number of clusters (k)
Silhouette Score Analysis for validating cluster quality
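
The two checks above can be sketched as follows. The data come from `make_blobs` with three hand-picked centers, and the range of k values is an assumption for the example; the notebook's actual range and results are not given in this README.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with three well-separated groups (illustrative only).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.6,
    random_state=0,
)

inertias = {}      # within-cluster sum of squares, used for the elbow plot
silhouettes = {}   # mean silhouette coefficient, higher is better

for k in range(2, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, model.labels_)

# The elbow is where inertia stops dropping sharply; the silhouette
# score gives a second opinion by picking the k with the highest mean score.
best_k = max(silhouettes, key=silhouettes.get)
print(inertias)
print(silhouettes)
print("k with highest silhouette:", best_k)
```

On real data the two criteria may disagree, in which case domain interpretability of the clusters is a reasonable tie-breaker.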

6. K-Means Clustering

Application of the K-Means algorithm (unsupervised learning) to segment countries
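
A sketch of how scaling, PCA, and K-Means could be chained and the labels attached back to a country table. The synthetic "countries", the indicator names, and the choice of 2 components and 3 clusters are all assumptions for illustration, not the notebook's settings.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Three synthetic groups of 20 hypothetical countries, 4 indicators each.
group_means = np.array(
    [[0, 0, 0, 0], [5, 5, 5, 5], [10, 0, 10, 0]], dtype=float
)
features = np.vstack(
    [mean + rng.normal(scale=0.5, size=(20, 4)) for mean in group_means]
)
df = pd.DataFrame(features, columns=[f"indicator_{i}" for i in range(4)])
df.insert(0, "country", [f"country_{i}" for i in range(len(df))])

# Scale -> reduce -> cluster, fitted as one pipeline.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
df["cluster"] = pipeline.fit_predict(df.drop(columns="country"))

# Per-cluster indicator means help interpret what each segment represents.
print(df.groupby("cluster").mean(numeric_only=True))
```

K-Means label numbers are arbitrary, so clusters are best named after inspecting their indicator means rather than by label value.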

How to Run

To run this project locally:

Clone the repository:

git clone https://github.com/azizzoaib786/countries-dataset-clustering.git

Install the required dependencies:
```
pip install -r requirements.txt
```
Run the notebook:
- Open countries-dataset-clustering.ipynb in Jupyter Notebook or JupyterLab.
- Execute all cells (Run All).

Requirements

Python (3.7 or higher recommended)
pandas
numpy
matplotlib
seaborn
scikit-learn
geopandas (for map visualizations)
Jupyter Notebook/JupyterLab

Contributions

Contributions are encouraged! Fork the repository, make improvements, and submit a pull request.

Contact

Email: [email protected]

License

This project is licensed under the MIT License.
