This project focuses on building predictive models to estimate the success of Airbnb property listings, leveraging Logistic Regression and Random Forest algorithms. The aim is to determine whether a given property listing will be successful based on various features extracted and engineered from the Airbnb dataset.
The objective is to accurately predict how likely an Airbnb property listing is to succeed, using machine learning techniques and data analysis.
The dataset utilized for this project is extensive and includes numerous listings with detailed features. It can be downloaded from:
(Alternatively, a direct link to the processed dataset will soon be provided.)
The project follows these structured steps:
- Exploratory Data Analysis (EDA)
  - Data cleaning and initial analysis to understand distributions, trends, and relationships between variables.
- Data Visualization
  - Heatmaps to visualize correlations between various features.
- Manual Feature Selection
  - Utilizing domain knowledge to select relevant and impactful features.
- Handling Missing Values
  - General imputation methods to handle incomplete data.
- Advanced Imputation (KNN)
  - K-Nearest Neighbors algorithm used specifically for imputing critical missing values in `review_scores_rating` and `reviews_per_month`.
- Data Preparation for Modeling
  - Encoding categorical features and scaling numeric features to enhance model performance.
- Feature Engineering
  - Creating additional meaningful features derived from the dataset to boost predictive accuracy.
- Logistic Regression Model
  - Application of Logistic Regression to classify listings as successful or unsuccessful.
- Random Forest Model
  - Employing a Random Forest classifier for improved accuracy and handling complex feature interactions.
- Model Comparison
  - Evaluating and comparing both models to determine the best performing predictive approach.
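
The imputation, preparation, modeling, and comparison steps above can be sketched as a single scikit-learn workflow. This is a minimal sketch on synthetic data, not the notebook's actual code. The `price` and `room_type` columns, the rule that defines the `successful` label, the share of missing values, and all hyperparameters are assumptions made for illustration. Only `review_scores_rating` and `reviews_per_month` come from the project description.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the listings data (hypothetical columns and values).
rng = np.random.default_rng(42)
n = 600
df = pd.DataFrame({
    "price": rng.gamma(shape=2.0, scale=60.0, size=n),
    "review_scores_rating": rng.uniform(60, 100, size=n),
    "reviews_per_month": rng.gamma(shape=2.0, scale=1.0, size=n),
    "room_type": rng.choice(
        ["Entire home/apt", "Private room", "Shared room"], size=n
    ),
})

# Hypothetical success label, computed before any values are removed.
df["successful"] = (
    (df["review_scores_rating"] > 80) & (df["reviews_per_month"] > 1.5)
).astype(int)

# Blank out roughly 15% of each review column to mimic missing data.
for col in ["review_scores_rating", "reviews_per_month"]:
    df.loc[rng.random(n) < 0.15, col] = np.nan

numeric_cols = ["price", "review_scores_rating", "reviews_per_month"]
categorical_cols = ["room_type"]

# Scale first (StandardScaler ignores NaN when fitting and keeps it when
# transforming), then let KNN fill the gaps using distances on scaled features.
numeric_steps = Pipeline([
    ("scale", StandardScaler()),
    ("impute", KNNImputer(n_neighbors=5)),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

X = df.drop(columns="successful")
y = df["successful"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit each model behind its own copy of the preprocessing and score it
# on the same held-out split.
results = {}
for name, model in models.items():
    clf = Pipeline([("preprocess", clone(preprocess)), ("model", model)])
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, clf.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, proba),
    }

print(pd.DataFrame(results).T.round(3))
```

Keeping the imputer and scaler inside the pipeline means they are fitted on the training split only, so the held-out scores are not inflated by information leaking from the test rows. Both models see identical preprocessing, which keeps the comparison fair.
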
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
git clone https://github.com/azizzoaib786/airbnb-dataset-prediction-multiple-algo.git
cd airbnb-dataset-prediction-multiple-algo
pip install -r requirements.txt
jupyter notebook
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.
This project is licensed under the MIT License; see the LICENSE file for details.
For questions or suggestions, please reach out via:
- Email: [email protected]