
ML-Algorithm-Library

ML-Algorithm-Library: A curated collection of classic machine learning algorithms implemented from scratch in Python + NumPy. Designed for clarity, extensibility, and learning—empowering you to inspect, modify, and build on foundational algorithms.


Table of Contents

  1. Overview
  2. Features
  3. Badges & CI
  4. Installation
  5. Usage
  6. Repository Structure
  7. Development & Contributing
  8. Roadmap
  9. License
  10. Acknowledgments
  11. Contact & Support

Overview

Machine learning is often treated as a black box. ML-Algorithm-Library demystifies core algorithms by providing pure-Python (+ NumPy) implementations with clear, well-documented code. This fosters deep understanding: inspect every operation, trace the math-to-code mapping, experiment with variations, and extend for your research or production prototypes.

Key goals:

  • Educational clarity: Code is written & commented to teach intuition as well as implementation details.
  • Minimal dependencies: Only NumPy and standard library; avoids hiding logic behind heavy frameworks.
  • Consistent interface: Uniform fit / predict (or analogous) APIs across models (see the sketch after this list).
  • Modular design: Organized by category (Regression, Classification, Clustering, etc.), making it easy to extend.
  • Production-awareness: While focusing on clarity, highlights performance considerations and when to scale up to optimized libraries.
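
To make the consistent-interface goal concrete, the sketch below shows the fit / predict contract that models in this style follow. The class names here (BaseEstimator, MeanRegressor) are illustrative, not part of the library:

import numpy as np

class BaseEstimator:
    """Sketch of the shared fit/predict contract (hypothetical)."""

    def fit(self, X, y=None):
        # Learn model parameters from training data; return self for chaining.
        raise NotImplementedError

    def predict(self, X):
        # Produce predictions (or labels, components, etc.) for new data.
        raise NotImplementedError

class MeanRegressor(BaseEstimator):
    """Toy model that always predicts the training mean, to show the API shape."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)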

Features

  • Core Algorithms Implemented from Scratch
    • Regression: Simple Linear, Multiple Linear, Polynomial, Ridge, Lasso
    • Classification: Logistic Regression, k-NN, SVM (linear & kernel), Naive Bayes, Decision Trees, Ensemble methods (Bagging, Random Forest, AdaBoost, Gradient Boosting)
    • Clustering: k-Means, Hierarchical Agglomerative, Gaussian Mixture Models (EM), optional DBSCAN
    • Dimensionality Reduction: PCA, LDA, SVD demonstrations
    • Neural Networks: Perceptron, Multilayer Perceptron with backpropagation, Autoencoders
    • Recommendation Basics: Collaborative Filtering, Content-Based Filtering, Matrix Factorization prototypes
    • (Optional/Advanced) Reinforcement Learning: Multi-armed bandit, Q-learning demos
  • Utilities
    • Data preprocessing: train-test split, normalization/standardization, one-hot encoding (sketched after this list)
    • Metrics: MSE, RMSE, MAE, R², accuracy, precision/recall/F1, confusion matrix, silhouette score, etc.
    • Helpers: reproducible random_state usage, logging/progress for iterative algorithms
  • Jupyter Notebooks & Examples
    • Intuitive tutorials for each algorithm: theory → code walkthrough → visualization → “what if” experiments
    • Standalone scripts for quick trials
  • Testing
    • Pytest-based unit tests on synthetic datasets to verify correctness and edge conditions
  • Extensible Structure
    • Encourage contributions: add new algorithms or improvements following the established template.
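
As an illustration of the utilities bullet above, the preprocessing helpers can be written in a few lines of NumPy. The function names below (train_test_split, standardize) are a sketch of the idea, not necessarily the library's exact API:

import numpy as np

def train_test_split(X, y, test_size=0.2, random_state=None):
    # Shuffle indices reproducibly, then slice into train and test partitions.
    rng = np.random.default_rng(random_state)
    idx = rng.permutation(len(X))
    cut = int(len(X) * (1 - test_size))
    return X[idx[:cut]], X[idx[cut:]], y[idx[:cut]], y[idx[cut:]]

def standardize(X):
    # Zero-mean, unit-variance scaling per feature (column); guards against zero variance.
    mean, std = X.mean(axis=0), X.std(axis=0)
    return (X - mean) / np.where(std == 0, 1.0, std)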

Badges & CI

  • Build Status: Automated tests run on GitHub Actions for every push/PR.
  • Coverage (optional): Integrate Codecov or Coveralls to monitor test coverage.
  • License: MIT.
  • Python Version Support: 3.7 and above.

Badges appear at the top of this README; the CI configuration file (.github/workflows/ci.yml) defines the following steps:

  1. Check out the code
  2. Set up the Python environment
  3. Install dependencies (numpy, pytest)
  4. Run pytest and, optionally, coverage reporting.

Installation

This library is lightweight and not currently published to PyPI. You can install it directly from GitHub:

# Clone repository
git clone https://github.com/AdilShamim8/ML-Algorithm-Library.git
cd ML-Algorithm-Library

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

# Install dependencies
pip install numpy
# If you plan to run tests or notebooks:
pip install pytest jupyter matplotlib

To integrate this library into another project, install in “editable” mode:

pip install -e .

(An editable install requires a setup.py or pyproject.toml in the project root; a minimal example is sketched below.)
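
If you go with setup.py, a minimal file along these lines is enough for an editable install. The metadata below is a sketch, not the repository's actual packaging configuration:

# setup.py (minimal sketch)
from setuptools import setup, find_packages

setup(
    name="ml-algorithm-library",
    version="0.1.0",
    packages=find_packages(include=["ml_algorithm_library", "ml_algorithm_library.*"]),
    python_requires=">=3.7",
    install_requires=["numpy"],
)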


Usage

Importing and Running an Algorithm

All algorithms follow a consistent pattern:

from ml_algorithm_library.regression.simple_linear import SimpleLinearReg

# Prepare data
X_train = [1, 2, 3, 4, 5]
y_train = [2, 4, 5, 4, 5]

# Instantiate & fit
model = SimpleLinearReg()
model.fit(X_train, y_train)

# Predict
X_test = [6, 7, 8]
y_pred = model.predict(X_test)
print("Predictions:", y_pred)

Example Notebooks

  • Regression Tutorial: notebooks/Regression/SimpleLinearRegression.ipynb

    • Intuition: fitting a line as minimizing squared errors.
    • Code walkthrough: calculating the slope and intercept (see the sketch after this list).
    • Visualization: plot data & fitted line.
    • Experiments: add noise, outliers, regularization.
  • Classification Tutorial: notebooks/Classification/LogisticRegression.ipynb

    • Derive sigmoid, loss, gradient descent.
    • Compare with scikit-learn output.
    • Plot decision boundary on synthetic data.
  • Clustering & Dimensionality Reduction: Similar structure with plots showing cluster assignments, PCA projections.
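
The slope/intercept calculation walked through in the regression notebook reduces to the ordinary least-squares closed form: slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄. A sketch in plain NumPy, using the same toy data as the Usage section:

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Closed-form OLS for a single feature: slope = cov(x, y) / var(x).
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print(f"y ~= {slope:.2f} * x + {intercept:.2f}")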

Run jupyter notebook to open and explore these demos. They serve both as learning material and as a quick reference for usage patterns.


Repository Structure

ML-Algorithm-Library/
│
├── ml_algorithm_library/           # Core package
│   ├── __init__.py
│   ├── regression/
│   │   ├── __init__.py
│   │   ├── simple_linear.py
│   │   ├── multiple_linear.py
│   │   ├── polynomial.py
│   │   ├── ridge.py
│   │   └── lasso.py
│   ├── classification/
│   │   ├── __init__.py
│   │   ├── logistic_regression.py
│   │   ├── knn.py
│   │   ├── svm.py
│   │   ├── naive_bayes.py
│   │   ├── decision_tree.py
│   │   └── ensemble/
│   │       ├── bagging.py
│   │       ├── random_forest.py
│   │       ├── adaboost.py
│   │       └── gradient_boosting.py
│   ├── clustering/
│   │   ├── __init__.py
│   │   ├── kmeans.py
│   │   ├── hierarchical.py
│   │   └── gmm.py
│   ├── dimensionality_reduction/
│   │   ├── __init__.py
│   │   ├── pca.py
│   │   └── lda.py
│   ├── neural_networks/
│   │   ├── __init__.py
│   │   ├── perceptron.py
│   │   ├── mlp.py
│   │   └── autoencoder.py
│   ├── recommendation/
│   │   ├── __init__.py
│   │   ├── collaborative_filtering.py
│   │   └── content_based.py
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── metrics.py
│   │   ├── data_preprocessing.py
│   │   └── helpers.py
│   └── ...                          # optional advanced modules
│
├── notebooks/                      # Jupyter tutorials & demos
│   ├── Regression/
│   ├── Classification/
│   ├── Clustering/
│   └── NeuralNetworks/
│
├── examples/                       # Standalone scripts (e.g., run_simple_linear.py)
│
├── tests/                          # Pytest tests
│   ├── test_simple_linear.py
│   ├── test_logistic_regression.py
│   └── ...
│
├── .github/
│   ├── workflows/
│   │   └── ci.yml                 # GitHub Actions CI config
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── CODE_OF_CONDUCT.md
│
├── requirements.txt                # numpy, pytest, jupyter, matplotlib (optional)
├── setup.py / pyproject.toml       # For packaging (optional)
├── README.md                       # This file
└── LICENSE                         # MIT License

Development & Contributing

Contributions are welcome! Follow these guidelines to ensure consistency and high quality.

Code Style & Standards

  • PEP 8 compliant: readable, concise, clear.
  • Docstrings: Use triple-quoted docstrings with a top-level summary, followed by parameters, returns, and a brief explanation of the algorithm’s intuition or references (see the example after this list).
  • Inline comments: Explain non-obvious steps (e.g., derivation points, vectorization decisions).
  • Type hints (optional): You may include type hints for clarity, but keep code beginner-friendly.
  • Dependencies: Only NumPy and standard library in core modules. Development dependencies (pytest, jupyter, matplotlib) listed in requirements.txt.
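
A docstring following this convention might look like the example below; the function is a hypothetical metrics helper, shown only to illustrate the format:

import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels.

    Parameters
    ----------
    y_true : array-like of shape (n_samples,)
        Ground-truth class labels.
    y_pred : array-like of shape (n_samples,)
        Predicted class labels.

    Returns
    -------
    float
        Value in [0, 1]; higher is better. Intuition: the count of correct
        predictions divided by the total number of samples.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))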

Branching & Workflow

  • Main branch: Always passing CI, stable.

  • Feature branches: Branch from main, using names such as feature/<algorithm-name> or fix/<issue-number>.

  • Pull Requests:

    • Base branch: main.
    • Provide clear PR title and description: what was added/changed, motivation, references.
    • Link any related issue.
    • Ensure all tests pass locally before opening PR.

Testing

  • Use pytest.
  • Write tests for new algorithms under tests/, e.g., test_<algorithm>.py (an example sketch follows this list).
  • Synthetic datasets: small, deterministic. Validate core functionality and edge cases.
  • Run locally: pytest --maxfail=1 --disable-warnings -q.
  • CI will run pytest on each push/PR; ensure coverage doesn’t drop significantly.
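
A test in this style might look like the sketch below. The import path mirrors the Usage section; adjust the names to match the actual module under test:

# tests/test_simple_linear.py (sketch)
import numpy as np
from ml_algorithm_library.regression.simple_linear import SimpleLinearReg

def test_recovers_exact_line():
    # Deterministic synthetic data lying exactly on y = 2x + 1.
    X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = 2.0 * X + 1.0

    model = SimpleLinearReg()
    model.fit(X, y)

    # A correct least-squares fit should reproduce the line (near) exactly.
    assert np.allclose(model.predict(X), y, atol=1e-6)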

Continuous Integration

  • GitHub Actions config in .github/workflows/ci.yml. Example steps:

    1. Check out the code.
    2. Set up Python (versions 3.7, 3.8, 3.9, 3.10+).
    3. Install dependencies (pip install numpy pytest).
    4. Run pytest.
    5. (Optional) Report coverage to Codecov.
  • Keep CI fast; avoid heavy dependencies in CI. Notebooks are not executed in CI by default (to save time), but code cells can be spot-tested if desired.

Issue & PR Templates

  • Issue template: Encourage clear bug reports or feature requests:

    • Title prefix: [BUG], [FEATURE], [DOC].
    • Description: Expected behavior, actual behavior, minimal reproduction code.
  • PR template: Checklist for authors:

    • Code follows style guidelines.
    • New tests added and passing.
    • Documentation/notebook updated.
    • Relevant issue linked.

Templates reside under .github/ISSUE_TEMPLATE.md and .github/PULL_REQUEST_TEMPLATE.md.

Code of Conduct

  • A CODE_OF_CONDUCT.md in .github/ defines community standards.
  • Encourage respectful, inclusive collaboration.
  • Clearly state reporting guidelines for unacceptable behavior.

Contributor Recognition

  • The contributor list is surfaced in the README via a badge or a link to the GitHub contributors graph.
  • Acknowledge significant contributions in the release notes or the acknowledgments section.

Testing Locally

  1. Clone the repository and set up the environment:

    git clone https://github.com/AdilShamim8/ML-Algorithm-Library.git
    cd ML-Algorithm-Library
    python -m venv venv
    source venv/bin/activate
    pip install numpy pytest
  2. Run tests:

    pytest
  3. (Optional) Run notebooks:

    pip install jupyter matplotlib
    jupyter notebook
  4. Ensure new contributions include tests and docs, and pass CI.


Roadmap

Track progress via GitHub Projects or Issues. Possible milestones:

  • v0.1: Core Regression & Classification (Simple Linear, Multiple Linear, Logistic, k-NN, Decision Tree)
  • v0.2: Regularized Models & Ensembles (Ridge, Lasso, Bagging, Random Forest)
  • v0.3: Clustering & Dimensionality Reduction (k-Means, PCA, Hierarchical Clustering)
  • v0.4: Neural Networks from Scratch (Perceptron, MLP)
  • v0.5: Advanced Methods (Gradient Boosting basics, GMM, Recommendation prototypes)
  • v1.0: Stable release with comprehensive tests, documentation, and example notebooks.

Each version is tagged following semantic versioning (v0.x.y), with release notes summarizing additions, fixes, and any breaking changes.


License

This project is licensed under the MIT License. See LICENSE for full text.


Acknowledgments

  • Classic machine learning textbooks and open-source implementations that inspired this library.
  • The broader ML community’s tutorials, papers, and discussions—guiding clarity and pedagogy.
  • Contributors who enhance the library with new algorithms, tests, and improved explanations.

Contact & Support

  • Issues & Feature Requests: Open an issue in this repository.
  • Discussion: Use GitHub Discussions (if enabled) or open an issue with a “Discussion” label.
  • Maintainer: Adil Shamim (@AdilShamim8).

Stay engaged: star the repository ⭐, share feedback, and help others learn by contributing examples or improvements.
