This repository collects my data science and data engineering projects. The key projects are listed below, followed by a brief description of each.
- Binomial Sentiment Classification Model
- Mixture Gamma Distribution
- Time Series Forecasting
- Airflow DAG generator
- PlayMarket Review Scraper
The sentiment classification model labels user comments left for the EGOV mobile app as positive or negative. It uses a Naive Bayes classifier, chosen for its simplicity and effectiveness on text classification problems. The model is trained on a labeled dataset of comments and then evaluated on how accurately it predicts the sentiment of unseen comments.
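As a rough illustration of the idea, here is a minimal multinomial Naive Bayes sketch in pure Python with add-one smoothing. It is not the repository's implementation: the function names `train_naive_bayes` and `predict`, the whitespace tokenization, and the four training comments are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)  # label -> word frequencies
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training comments, invented for illustration only.
comments = [
    ("great app very convenient", "positive"),
    ("love the fast service", "positive"),
    ("app crashes all the time", "negative"),
    ("terrible slow and buggy", "negative"),
]
model = train_naive_bayes(comments)
```

Calling `predict(model, "slow app crashes")` on this toy model returns `"negative"`. A real pipeline would likely add proper tokenization and a held-out test set for the accuracy evaluation.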
As part of a research project, the mixture gamma distribution code explores the performance of the chi-square goodness-of-fit test on mixtures of gamma distributions with varying parameters and mixing proportions. The aim is to evaluate how well the test detects deviations from the expected distribution when the data come from complex mixtures.
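The kind of experiment described above can be sketched with the standard library alone. This is an illustrative setup, not the research code: it assumes integer shape parameters (so the gamma CDF has the closed Erlang form), fixed bin edges, and mixture parameters chosen arbitrarily for the example. All function names are hypothetical.

```python
import math
import random


def erlang_cdf(x, shape, scale):
    """CDF of a gamma distribution with integer shape (Erlang)."""
    if x <= 0:
        return 0.0
    z = x / scale
    tail = sum(z**n / math.factorial(n) for n in range(shape))
    return 1.0 - math.exp(-z) * tail


def mixture_cdf(x, components):
    """CDF of a mixture; components are (weight, shape, scale) triples."""
    return sum(w * erlang_cdf(x, k, s) for w, k, s in components)


def sample_mixture(components, n, rng):
    """Draw n values: pick a component by weight, then sample from it."""
    weights = [w for w, _, _ in components]
    draws = []
    for _ in range(n):
        _, shape, scale = rng.choices(components, weights=weights)[0]
        draws.append(rng.gammavariate(shape, scale))
    return draws


def chi_square_statistic(data, components, edges):
    """Pearson statistic over bins [edges[i], edges[i+1]); the last bin is open-ended."""
    n = len(data)
    cdf_at_edges = [mixture_cdf(e, components) for e in edges] + [1.0]
    statistic = 0.0
    for i in range(len(edges)):
        lower = edges[i]
        upper = edges[i + 1] if i + 1 < len(edges) else math.inf
        observed = sum(lower <= x < upper for x in data)
        expected = n * (cdf_at_edges[i + 1] - cdf_at_edges[i])
        statistic += (observed - expected) ** 2 / expected
    return statistic


rng = random.Random(0)
# Illustrative parameters only: (weight, shape, scale) per component.
null_model = [(0.5, 2, 1.0), (0.5, 5, 2.0)]
shifted_model = [(0.5, 2, 1.0), (0.5, 5, 4.0)]
edges = [0, 1, 2, 3, 5, 8, 12, 16]

stat_null = chi_square_statistic(sample_mixture(null_model, 2000, rng), null_model, edges)
stat_shifted = chi_square_statistic(sample_mixture(shifted_model, 2000, rng), null_model, edges)
```

With eight bins and no estimated parameters, the statistic has 7 degrees of freedom, so the 5% critical value is about 14.07. Repeating the experiment over many samples and counting rejections would give an estimate of the test's power for a given deviation.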
The time series forecasting project develops models that predict future values from previously observed values. It uses ARIMA, LSTM networks, and other machine learning approaches suited to the sequential nature of time series data. The goal is to produce accurate, actionable forecasts that support decision-making.
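To make the forecasting idea concrete, the sketch below fits only the simplest autoregressive building block, an AR(1) model, by least squares and iterates it forward. It is a deliberately reduced stand-in, not ARIMA or an LSTM and not the repository's code; the function names and the synthetic series are assumptions.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + phi * x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


def forecast(series, steps, intercept, phi):
    """Iterate the fitted recursion forward from the last observation."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = intercept + phi * last
        predictions.append(last)
    return predictions


# Synthetic series that follows x[t] = 1 + 0.5 * x[t-1] exactly.
history = [0.0]
for _ in range(9):
    history.append(1 + 0.5 * history[-1])

intercept, phi = fit_ar1(history)
next_values = forecast(history, 3, intercept, phi)
```

Because the synthetic series is noise-free, the fit recovers the intercept and coefficient used to generate it. Real data would call for differencing, more lags, and out-of-sample validation, which a full ARIMA library handles.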
The Airflow DAG generator automates the creation of basic Apache Airflow DAGs. It reads configuration details from an Excel file and renders Jinja2 templates to generate the DAG files used for task orchestration.
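The template-rendering step might look roughly like the sketch below. The template text, the column names (`dag_id`, `schedule`, `task_id`, `command`), and the `render_dag` helper are all hypothetical. The rows are hard-coded here in place of reading the Excel file, and the generated code assumes Airflow 2.4 or later (for the `schedule` argument).

```python
from jinja2 import Template

# Hypothetical template; the repository's real template may differ.
DAG_TEMPLATE = Template('''\
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="{{ dag_id }}",
    schedule="{{ schedule }}",
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
{%- for task in tasks %}
    {{ task.task_id }} = BashOperator(
        task_id="{{ task.task_id }}",
        bash_command="{{ task.command }}",
    )
{%- endfor %}
''')

# Rows as they might look after loading the Excel sheet (assumed columns).
rows = [
    {"dag_id": "daily_report", "schedule": "@daily", "task_id": "extract", "command": "echo extract"},
    {"dag_id": "daily_report", "schedule": "@daily", "task_id": "load", "command": "echo load"},
]


def render_dag(rows):
    """Render one DAG file from rows that share a dag_id.

    Each task_id doubles as a Python variable name in the output,
    so it must be a valid identifier.
    """
    first = rows[0]
    return DAG_TEMPLATE.render(
        dag_id=first["dag_id"],
        schedule=first["schedule"],
        tasks=[{"task_id": r["task_id"], "command": r["command"]} for r in rows],
    )


dag_source = render_dag(rows)
```

The resulting `dag_source` string can be written to a `.py` file in the Airflow DAGs folder. Task dependencies are left out of this sketch; they would need an extra column and an extra template block.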
The review scraper is a Flask web application that scrapes user reviews of any Google Play app, given the app's URL. It displays the reviews in a table and provides an option to download them as a CSV file for further analysis.
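Two pieces of that flow, reading the app identifier from the URL and producing the CSV download, can be sketched with the standard library alone. The Flask routes and the scraping itself are omitted. The helper names, the review fields (`user`, `rating`, `text`), and the sample reviews are assumptions for illustration, not the application's actual schema.

```python
import csv
import io
from urllib.parse import parse_qs, urlparse


def extract_app_id(url):
    """Return the package name from a Google Play details URL."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("id")
    if not ids:
        raise ValueError(f"no 'id' parameter in URL: {url}")
    return ids[0]


def reviews_to_csv(reviews, fields=("user", "rating", "text")):
    """Serialize review dicts to CSV text suitable for a file download."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(reviews)
    return buffer.getvalue()


# Hypothetical reviews standing in for scraped data.
sample_reviews = [
    {"user": "alice", "rating": 5, "text": "Works well, mostly"},
    {"user": "bob", "rating": 2, "text": "Too slow"},
]
app_id = extract_app_id("https://play.google.com/store/apps/details?id=com.example.app&hl=en")
csv_text = reviews_to_csv(sample_reviews)
```

In a Flask view, `csv_text` could be returned in a response with a `text/csv` content type so that the browser offers it as a download.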