hemanth-007/Email-Spam-Filtering

Email-Spam-Filtering

Source of Dataset: http://spamassassin.apache.org/old/publiccorpus

• Built a Naive Bayes classification model that labels each email as either spam or ham

• Used Beautiful Soup, the re library, and the email parser to extract plain text from each email, then applied stemming

• Implemented a CountVectorizer and TfidfTransformer pipeline from scratch to transform emails into a sparse matrix of TF-IDF features

• Evaluated the model with cross-validation, achieving an accuracy of 97.84% and a recall of 91.3%

• Tools used: scikit-learn, email parser, Beautiful Soup, re, NLTK, SciPy, Jupyter Notebook
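
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses scikit-learn's built-in `CountVectorizer` and `TfidfTransformer` rather than a from-scratch implementation, and the helper names (`email_to_text`, `make_email`) and the tiny toy corpus are hypothetical stand-ins for the SpamAssassin data, so the printed scores will not match the figures reported above.

```python
import re
from email import message_from_string, policy

from bs4 import BeautifulSoup
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

stemmer = PorterStemmer()


def email_to_text(raw):
    """Return stemmed, lowercased body text of a raw email (hypothetical helper)."""
    msg = message_from_string(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue  # skip multipart containers and attachments
        try:
            content = part.get_content()
        except (LookupError, UnicodeDecodeError):
            continue  # unknown charset or undecodable payload
        if ctype == "text/html":
            # Strip markup, keeping only the visible text
            content = BeautifulSoup(content, "html.parser").get_text(separator=" ")
        parts.append(content)
    words = re.findall(r"[a-z]+", " ".join(parts).lower())
    return " ".join(stemmer.stem(w) for w in words)


def make_email(body, html=False):
    """Build a minimal raw message for the toy corpus (hypothetical helper)."""
    ctype = "text/html" if html else "text/plain"
    return f"Subject: test\nContent-Type: {ctype}; charset=utf-8\n\n{body}\n"


# Toy corpus, made up for illustration only
spam = [
    make_email("<html><body><p>Winning prizes, claim free money now</p></body></html>", html=True),
    make_email("Free money offer, click to claim your prize"),
    make_email("Cheap pills, free offer, buy now"),
    make_email("<p>Claim your free prize money today</p>", html=True),
]
ham = [
    make_email("Meeting moved to Monday, see the agenda attached"),
    make_email("Can you review the patch before the release?"),
    make_email("Lunch on Friday? The agenda for the meeting is ready"),
    make_email("The release notes and the patch are on the mailing list"),
]

X = [email_to_text(e) for e in spam + ham]
y = [1] * len(spam) + [0] * len(ham)  # 1 = spam, 0 = ham

# Word counts -> TF-IDF weights -> multinomial Naive Bayes
model = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("nb", MultinomialNB()),
])

# Out-of-fold predictions from 2-fold stratified cross-validation
pred = cross_val_predict(model, X, y, cv=2)
accuracy = accuracy_score(y, pred)
recall = recall_score(y, pred)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Recall is reported alongside accuracy because it measures the share of actual spam that the model catches, which accuracy alone can hide when ham outnumbers spam.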
