
A machine learning project that uses Logistic Regression to classify emails as spam or not spam based on their content and metadata. The model is trained on labeled email data using text preprocessing techniques and converts text into numerical features to accurately detect unwanted messages.


muqadasejaz/Email-Spam-Classifier


📧 Email Spam Classifier

A machine learning-based web application that classifies emails as Spam or Not Spam using Natural Language Processing (NLP) and Logistic Regression. This project uses Scikit-learn, TensorFlow, Flask, and NLTK, and is built with a simple yet functional user interface.


📝 Overview

Email spam is a major issue in digital communication, often leading to security risks and lost productivity. This project provides an effective solution using machine learning and NLP to classify emails as spam or not spam.

The model is trained on a Kaggle dataset using Logistic Regression, after the text is preprocessed with NLTK (stopword removal and lemmatization). The cleaned text is vectorized with CountVectorizer, and the model reaches 95% accuracy on test data.
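The training flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the eight emails and their labels are invented, scikit-learn's built-in `stop_words="english"` list stands in for the NLTK stopword removal, and lemmatization is left out.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy data for illustration only; this is NOT the Kaggle dataset.
emails = [
    "Win a free prize now, click here",
    "Limited offer: claim your free cash reward",
    "Congratulations, you won a lottery, send your bank details",
    "Cheap loans approved instantly, apply now",
    "Meeting moved to 3pm, see agenda attached",
    "Can you review my pull request today?",
    "Lunch tomorrow? Let me know what time works",
    "Here are the notes from yesterday's lecture",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

# CountVectorizer lowercases and tokenizes each email, drops English stop
# words, and produces a vector of word counts. Logistic Regression is then
# fitted on those count vectors.
model = make_pipeline(
    CountVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(emails, labels)

new_emails = ["Claim your free prize now", "Agenda for tomorrow's meeting"]
predictions = model.predict(new_emails)
probabilities = model.predict_proba(new_emails)  # columns follow model.classes_

for text, label in zip(new_emails, predictions):
    print(f"{'Spam' if label == 1 else 'Not Spam'}: {text}")
```

Wrapping the vectorizer and classifier in one pipeline means new emails pass through the same vocabulary that was learned during training, which avoids a mismatch between training and prediction features.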

A simple Flask web app lets users enter email text, view the prediction, and see model performance in real time. The project demonstrates a complete ML workflow, from training to deployment.
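A Flask front end of this kind might look like the sketch below. The route names (`/` and `/predict`), the form field `email_text`, the inline `PAGE` template, and the tiny training set are all assumptions made so the example runs on its own. The real app presumably loads its trained model and uses its own templates.

```python
from flask import Flask, render_template_string, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training data so the sketch is self-contained.
# A real app would load an already trained model instead.
TRAIN_TEXTS = [
    "Win a free prize now, click here",
    "Claim your free cash reward today",
    "Meeting moved to 3pm, see agenda attached",
    "Can you review my pull request today?",
]
TRAIN_LABELS = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(TRAIN_TEXTS, TRAIN_LABELS)

# Hypothetical inline template: a form plus the result and original input.
PAGE = """
<h1>Email Spam Classifier</h1>
<form method="post" action="/predict">
  <textarea name="email_text" rows="8" cols="60"></textarea><br>
  <button type="submit">Classify</button>
</form>
{% if result %}
  <p>Prediction: <strong>{{ result }}</strong></p>
  <p>Original input: {{ email_text }}</p>
{% endif %}
{% if error %}<p>{{ error }}</p>{% endif %}
"""

app = Flask(__name__)


@app.route("/")
def index():
    # Show the empty form.
    return render_template_string(PAGE)


@app.route("/predict", methods=["POST"])
def predict():
    email_text = request.form.get("email_text", "").strip()
    if not email_text:
        return render_template_string(PAGE, error="Please enter some email text.")
    label = model.predict([email_text])[0]
    result = "Spam" if label == 1 else "Not Spam"
    # Echo the original input next to the prediction.
    return render_template_string(PAGE, result=result, email_text=email_text)
```

To try it locally, calling `app.run()` at the end of the file (or using the `flask run` command) starts the development server on localhost.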


🚀 Features

  • Logistic Regression-based spam classification

  • NLP preprocessing using NLTK

  • Text vectorization using CountVectorizer

  • 95% model accuracy on test data

  • Interactive web interface using Flask

  • Displays both prediction result and original input

  • Terminal-based prediction loop for new entries

  • Clean and modular code structure
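The terminal-based prediction loop listed above could be structured along these lines. This is a sketch under stated assumptions: `prediction_loop` and `keyword_predict` are hypothetical names, and `keyword_predict` is a trivial keyword check standing in for the trained Logistic Regression model so that the example has no dependencies.

```python
# Hypothetical hint words used only by the stand-in predictor below.
SPAM_HINTS = {"free", "prize", "winner", "lottery", "click"}


def keyword_predict(text: str) -> int:
    """Stand-in for the trained model: 1 (spam) if any hint word appears."""
    words = set(text.lower().split())
    return 1 if words & SPAM_HINTS else 0


def prediction_loop(predict, read=input, write=print) -> int:
    """Classify entries until the user types 'quit' or 'exit'.

    `read` and `write` default to the terminal but can be swapped out,
    which makes the loop easy to test. Returns the number of emails
    that were classified.
    """
    count = 0
    while True:
        try:
            text = read("Enter email text (or 'quit'): ").strip()
        except EOFError:
            break  # end of input, e.g. Ctrl-D
        if text.lower() in {"quit", "exit"}:
            break
        if not text:
            continue  # ignore blank lines
        label = predict(text)
        write(f"Prediction: {'Spam' if label == 1 else 'Not Spam'}")
        write(f"Input: {text}")
        count += 1
    return count
```

Calling `prediction_loop(keyword_predict)` starts an interactive session. With a fitted scikit-learn model, a wrapper such as `lambda text: model.predict([text])[0]` could be passed in place of the stand-in.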


📊 Dataset


🧪 Results

  • Model Used: Logistic Regression

  • Vectorizer: CountVectorizer

  • Accuracy: 95%

  • Evaluation Metrics:

    • Confusion Matrix

      image

    • Classification Report

      image

    • Prediction on new emails:

      image
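For readers who want to reproduce this kind of evaluation, the confusion matrix and classification report can be computed with scikit-learn as sketched below. The label lists are invented purely to show the calls. They are not the project's test data, and the 0.75 accuracy they produce is unrelated to the reported 95%.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Invented labels for illustration: 1 = spam, 0 = not spam.
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]

# Rows are true classes, columns are predicted classes, in the order
# given by `labels`: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()
print(cm)

# Per-class precision, recall, and F1 score.
print(
    classification_report(
        y_true, y_pred, labels=[0, 1], target_names=["Not Spam", "Spam"]
    )
)

accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```

In a spam filter, false positives (legitimate mail flagged as spam) are often costlier than false negatives, so it is worth reading precision for the spam class alongside overall accuracy.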


🛠️ Tools & Technologies

| Category | Tools/Technologies |
| --- | --- |
| Libraries | Scikit-learn, TensorFlow, NLTK, Flask |
| Techniques | NLP, Logistic Regression, CountVectorizer |
| Tools | Jupyter Notebook, VS Code |
| Language | Python |
| Deployment | Flask App (Localhost) |

📚 References


👩‍💻 Author

Muqadas Ejaz

BS Computer Science (AI Specialization)

Machine Learning & Computer Vision Enthusiast

📫 Connect with me on LinkedIn

🌐 GitHub: github.com/muqadasejaz

