Machine learning scripts in Python Jupyter notebooks
Project I-1 Boston_housing:
The project uses supervised learning algorithms to train models on the Boston Housing dataset in order to predict prices for clients who intend to purchase houses.
Dataset: Boston Housing prices and features (e.g. number of rooms, student-to-teacher ratio, etc.) from Kaggle.
Project steps brief summary:
- Load Dataset
- Split the data into training and testing sets
- Train model using DecisionTreeRegressor
- Evaluate the model with GridSearchCV cross-validation
- Make predictions
- Assess whether more features should be included.
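The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a synthetic stand-in generated with make_regression (the real project loads the Boston Housing data), and the max_depth grid, the R² scoring, and the 80/20 split are assumptions chosen for the example.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the housing data (assumption: three numeric features).
X, y = make_regression(n_samples=400, n_features=3, noise=10.0, random_state=0)

# Hold out 20% of the rows for testing (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Cross-validated search over tree depth (hypothetical grid for illustration).
param_grid = {"max_depth": list(range(1, 11))}
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0), param_grid, scoring="r2", cv=5
)
search.fit(X_train, y_train)

# Score the best tree on the held-out data and predict for one "client" row.
best_model = search.best_estimator_
test_r2 = r2_score(y_test, best_model.predict(X_test))
predicted_price = best_model.predict(X_test[:1])[0]

print("best max_depth:", search.best_params_["max_depth"])
print("test R^2:", round(test_r2, 3))
print("prediction for first test row:", round(predicted_price, 2))
```

A shallow tree tends to underfit and a very deep one to overfit, which is why the depth is chosen by cross-validation rather than fixed by hand.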
Project I-2 Find_donors:
Project steps:
- Evaluate three classifiers (e.g. SVC, Stochastic Gradient Descent classifier, ensemble methods such as AdaBoost, logistic regression, etc.) by training the models on the training data and computing accuracy_score and the F0.5 score (fbeta_score with β=0.5) on both the training data and the testing data.
- The best model is selected based on computing time and model performance (accuracy_score, F-score with β=0.5).
- After the best model is selected, GridSearchCV is used for hyperparameter tuning and model optimization.
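A rough sketch of this comparison and tuning workflow is shown below. It is illustrative only: the data comes from make_classification rather than the project's dataset, the three candidates are just one possible pick from the list above, and tuning AdaBoost with this particular grid is an assumption made for the example, not a statement about which model the notebook selected.

```python
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, fbeta_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the donor data (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.75, 0.25], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Three candidate classifiers (an arbitrary pick for illustration).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "sgd": SGDClassifier(random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
}

results = {}
for name, clf in candidates.items():
    start = perf_counter()
    clf.fit(X_train, y_train)
    train_time = perf_counter() - start
    pred_train = clf.predict(X_train)
    pred_test = clf.predict(X_test)
    results[name] = {
        "train_time": train_time,
        "acc_train": accuracy_score(y_train, pred_train),
        "acc_test": accuracy_score(y_test, pred_test),
        # beta=0.5 weights precision more heavily than recall.
        "f05_train": fbeta_score(y_train, pred_train, beta=0.5, zero_division=0),
        "f05_test": fbeta_score(y_test, pred_test, beta=0.5, zero_division=0),
    }

for name, row in results.items():
    print(name, {k: round(v, 3) for k, v in row.items()})

# Tune one model with GridSearchCV, scored by F0.5 (hypothetical grid).
scorer = make_scorer(fbeta_score, beta=0.5)
grid = {"n_estimators": [25, 50], "learning_rate": [0.5, 1.0]}
search = GridSearchCV(AdaBoostClassifier(random_state=0), grid, scoring=scorer, cv=3)
search.fit(X_train, y_train)
print("best params:", search.best_params_)
```

Recording the training time next to the scores keeps both selection criteria in one table, so a slightly more accurate but much slower model can be weighed against a faster one.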
Project I-3 Create_customer_segments:
Project brief summary:
- Unsupervised learning problem (e.g. hierarchical clustering, Gaussian Mixture Models, complete- and average-link clustering, etc.)
- Project steps:
- Data Preprocessing: log transform, outlier detection (points lying more than 1.5 × IQR beyond the quartiles; a few of these are removed where justified)
- PCA: dimensionality reduction.
- Clustering: K-Means vs. Gaussian Mixture Model; Silhouette score used as the evaluation metric.
- Visualization, Prediction: pca.inverse_transform for centers of clusters, predict selected samples and compare with raw data.
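The pipeline above might look roughly like the sketch below. It runs on synthetic log-normal "spending" data rather than the project's dataset, and several choices are assumptions for illustration: four features, two clusters, two principal components, and dropping only rows flagged as outliers in two or more features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic positive-valued data with two customer groups (assumption).
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=[8, 6, 6, 8], sigma=0.4, size=(150, 4))
group_b = rng.lognormal(mean=[6, 8, 8, 6], sigma=0.4, size=(150, 4))
data = np.vstack([group_a, group_b])

# Log transform to reduce skew.
log_data = np.log(data)

# Flag values more than 1.5 * IQR beyond the quartiles, per feature.
q1, q3 = np.percentile(log_data, [25, 75], axis=0)
step = 1.5 * (q3 - q1)
outlier_mask = (log_data < q1 - step) | (log_data > q3 + step)

# Drop only rows flagged in two or more features (assumed removal rule).
good = log_data[outlier_mask.sum(axis=1) < 2]

# Reduce to two principal components.
pca = PCA(n_components=2, random_state=0).fit(good)
reduced = pca.transform(good)

# Cluster with K-Means and a Gaussian mixture, compare silhouette scores.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(reduced)
gmm = GaussianMixture(n_components=2, random_state=0).fit(reduced)
sil_kmeans = silhouette_score(reduced, kmeans.labels_)
sil_gmm = silhouette_score(reduced, gmm.predict(reduced))

# Map cluster centers back to the original units: undo PCA, then undo the log.
log_centers = pca.inverse_transform(kmeans.cluster_centers_)
true_centers = np.exp(log_centers)

# Predict the cluster of a few selected samples for comparison with raw data.
sample_labels = kmeans.predict(reduced[:3])

print("silhouette (K-Means):", round(sil_kmeans, 3))
print("silhouette (GMM):", round(sil_gmm, 3))
print("cluster centers in original units:\n", np.round(true_centers, 1))
print("labels of first three samples:", sample_labels)
```

Because the clustering runs on log-transformed, PCA-reduced data, the centers are only interpretable after both transforms are inverted, which is what the pca.inverse_transform and np.exp steps do.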
Project III: Deep Learning Practice Projects
Project IV: Deep Learning Specialization