This project uses the Enron email dataset as a realistic test case. Emails are classified based on the content of the message. Features are extracted with BERT, Google's NLP model, and the resulting feature vectors are fed to a standard artificial neural network that performs the classification. For comparison, the same classification task was run with several other models: Linear Support Vector Machine, Random Forest, SGD Classifier, and LSTM. The classical machine learning models were tried with different text representations, namely TF-IDF and CountVectorizer. Topic modelling with Latent Dirichlet Allocation was used to find the major topics of discussion in the dataset.
forked from shaival2905/email-classification-using-BERT
krupalshah6996/email-classification-using-BERT
About
Classification of various users' emails based on writing patterns, and extraction of topics from the text corpus, using the Enron dataset
Languages
- Jupyter Notebook 87.3%
- Python 12.7%