Loan Default Risk Analysis

Dataset description

The dataset is from the U.S. Small Business Administration (SBA). The U.S. SBA was founded in 1953 on the principle of promoting and assisting small enterprises in the U.S. credit market (SBA Overview and History, US Small Business Administration (2015)). Small businesses have been a primary source of job creation in the United States; fostering small business formation and growth therefore has social benefits, creating job opportunities and reducing unemployment. There have been many success stories of start-ups that received SBA loan guarantees, such as FedEx and Apple Computer. However, there have also been small businesses and start-ups that defaulted on their SBA-guaranteed loans.

The project includes the following tasks:

Load the dataset. Don't use the "index" column for training.
Clean up the data:
- Encode/replace missing values
- Replace feature values that appear incorrect
Encode categorical variables
Split the dataset into Train/Validation/Test
Add engineered features
Train and tune an ML model
Provide final metrics using the Test dataset
Provide a scoring function that can be used to score new data. You can test your scoring function on the provided "scoring" dataset.
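
The preparation steps above could be sketched roughly as follows. This is a minimal illustration on a toy frame, not the project's actual pipeline: apart from "index", every column name (`loan_amount`, `state`, `defaulted`) is hypothetical, and the 60/20/20 split ratio and the choice to treat negative amounts as incorrect are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the loan data; all column names except "index" are hypothetical.
df = pd.DataFrame({
    "index": range(20),
    "loan_amount": [50_000, np.nan, 120_000, -1, 75_000] * 4,
    "state": ["CA", "NY", None, "TX", "CA"] * 4,
    "defaulted": [0, 1, 0, 0, 1] * 4,
})

# The "index" column carries no signal, so drop it before training.
df = df.drop(columns=["index"])

# Values that appear incorrect (assumed here: negative amounts) become missing.
df.loc[df["loan_amount"] < 0, "loan_amount"] = np.nan

# Engineered feature: flag rows whose amount was missing or incorrect.
df["loan_amount_missing"] = df["loan_amount"].isna().astype(int)

# Missing categories get an explicit level, then one-hot encode.
df["state"] = df["state"].fillna("MISSING")
df = pd.get_dummies(df, columns=["state"])

# Stratified 60/20/20 split into Train/Validation/Test.
train_val, test = train_test_split(
    df, test_size=0.2, stratify=df["defaulted"], random_state=0
)
train, val = train_test_split(
    train_val, test_size=0.25, stratify=train_val["defaulted"], random_state=0
)

# Fit the imputation value on Train only, then apply it to every split,
# so Validation and Test do not leak into the preprocessing.
median = train["loan_amount"].median()
train, val, test = (
    part.assign(loan_amount=part["loan_amount"].fillna(median))
    for part in (train, val, test)
)

print(len(train), len(val), len(test))
```

Computing the median after the split is a design choice in this sketch; the same statistics would then be reused inside the scoring function when new data arrives.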

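The goal of using linear models is to be able to interpret the results via their coefficients. Do not apply PCA/TruncatedSVD: the model would then be fit on derived components rather than the original features, which makes the coefficients unusable for interpretation.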
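
As a rough illustration of what coefficient-based interpretation can look like, the sketch below fits a logistic regression on standardized synthetic data and reads each coefficient as an odds ratio per one standard deviation of its feature. The data and the feature names (`f0` to `f3`) are placeholders, not columns from the SBA dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Placeholder names; in the project these would be the original feature columns.
feature_names = ["f0", "f1", "f2", "f3"]
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

# Standardizing puts the coefficients on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression().fit(X_scaled, y)

# exp(coefficient) is the multiplicative change in the odds of the positive
# class for a one-standard-deviation increase in that feature.
odds_ratios = dict(zip(feature_names, np.exp(model.coef_[0])))
for name, ratio in sorted(odds_ratios.items(), key=lambda kv: -abs(np.log(kv[1]))):
    print(f"{name}: odds ratio per +1 std = {ratio:.2f}")
```

Each coefficient maps back to exactly one named input here; after PCA or TruncatedSVD it would instead describe a mixture of all inputs.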

Types of models to train

Your final submission should include a single model. For each of the model types below, try to come up with the best model of that type:

Identify best model from: Sklearn Logistic Regression - try all combinations of regularization
Identify best model from: H2O-3 GLM - try different combinations of regularization
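
One possible way to search over regularization for the scikit-learn model is sketched below on synthetic data. It assumes the `saga` solver with an elastic-net penalty, where `l1_ratio=0` corresponds to L2, `l1_ratio=1` to L1, and values in between to a mix; the grid values, the 3-fold cross-validation, and the use of average precision as the AUCPR estimate are illustrative choices. The H2O-3 GLM exposes analogous `alpha` and `lambda` settings, which are not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the prepared loan features.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own train part.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(penalty="elasticnet", solver="saga", max_iter=5000)),
])

# C is the inverse regularization strength; l1_ratio blends L2 (0.0) and L1 (1.0).
param_grid = {
    "clf__C": [0.01, 0.1, 1.0, 10.0],
    "clf__l1_ratio": [0.0, 0.5, 1.0],
}

search = GridSearchCV(pipe, param_grid, scoring="average_precision", cv=3)
search.fit(X_train, y_train)

# Final metric is reported once, on the untouched Test split.
test_aucpr = average_precision_score(y_test, search.predict_proba(X_test)[:, 1])
print(search.best_params_, round(test_aucpr, 3))
```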

Evaluation metric: AUCPR
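
For reference, AUCPR is the area under the precision-recall curve. One common estimator is average precision, which sums precision at each score threshold weighted by the increase in recall. The function below is a small dependency-free sketch of that estimator; the name `average_precision` is illustrative, and library implementations (including H2O's) may compute the area slightly differently.

```python
def average_precision(y_true, scores):
    """Average precision: sum over thresholds of (recall step) * precision."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(order):
        # Consume every example tied at this score, so each distinct
        # score acts as a single threshold.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            tp += y_true[order[j]]
            j += 1
        precision = tp / j          # j examples are predicted positive so far
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Two of the four loans defaulted; one non-default is ranked above a default.
print(round(average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]), 4))
```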

Lucky-akash321/Credit-Loan-Risk-Analysis-and-Prediction
