Movie Reviews Sentiment Analysis

Overview

This project implements sentiment analysis on movie reviews using both traditional Machine Learning and BERT-based approaches. The system can classify movie reviews as either positive (1) or negative (0) with high accuracy.

Project Structure

├── BERT accuracy/          # BERT model performance visualizations
├── ML models accuracy/     # ML models performance visualizations
├── NLP_Data/               # Dataset files
│   └── all_reviews.csv     # Combined dataset
├── NLP project.py          # Main project implementation
├── NLP project.ipynb       # Main project implementation as Jupyter Notebook
├── Documentation.pdf       # Detailed documentation
└── requirements.txt        # Project dependencies

Setup and Installation

Clone the repository:

git clone https://github.com/Mohammed2372/Movie-Reviews-Sentiment-Analysis.git
cd Movie-Reviews-Sentiment-Analysis

Create a virtual environment (recommended):

python -m venv venv
.\venv\Scripts\Activate

Install dependencies:

pip install -r requirements.txt

Download NLTK resources: The script will automatically download required NLTK resources on first run, or you can manually download them:

import nltk
nltk.download(['punkt', 'wordnet', 'stopwords', 'averaged_perceptron_tagger'])

Model Training

Run the main script to train both ML models and BERT:

python "NLP project.py"

Models

Traditional Machine Learning

Logistic Regression
Linear SVC
Random Forest All models use TF-IDF vectorization with unigrams and bigrams.

BERT Model

Base: textattack/bert-base-uncased-SST-2
Fine-tuned for sentiment analysis
Includes early stopping and model checkpointing

Performance

Machine Learning Models

Results available in ML models accuracy/
Classification reports for each model
Comparative performance analysis

BERT Model

Results available in BERT accuracy/
Training loss curves
Evaluation metrics
Final test results

Documentation

For detailed information about:

Data preprocessing steps
Model architectures
Training configurations
Performance metrics
Implementation details

Please refer to Documentation.pdf for more details.

Usage

To use the trained model for predictions (after training and saving the model):

from transformers import BertForSequenceClassification, BertTokenizer

# Load the model
model_path = "./bert_model"
model = BertForSequenceClassification.from_pretrained(model_path)
tokenizer = BertTokenizer.from_pretrained(model_path)

# Prepare text
text = "Your movie review here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Get prediction
outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=-1)
sentiment = "positive" if prediction == 1 else "negative"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Movie Reviews Sentiment Analysis

Overview

Project Structure

Setup and Installation

Model Training

Models

Traditional Machine Learning

BERT Model

Performance

Machine Learning Models

BERT Model

Documentation

Usage

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
BERT accuracy		BERT accuracy
ML models accuracy		ML models accuracy
NLP_Data		NLP_Data
.gitignore		.gitignore
Documentation.pdf		Documentation.pdf
NLP project.ipynb		NLP project.ipynb
NLP project.py		NLP project.py
README.md		README.md
requirements.txt		requirements.txt

Mohammed2372/Movie-Reviews-Sentiment-Analysis

Folders and files

Latest commit

History

Repository files navigation

Movie Reviews Sentiment Analysis

Overview

Project Structure

Setup and Installation

Model Training

Models

Traditional Machine Learning

BERT Model

Performance

Machine Learning Models

BERT Model

Documentation

Usage

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages