This project predicts credit card approval using machine learning. It builds and evaluates several classification models to identify the one that predicts approval most accurately.
- Data Preparation
  - Loading and merging the credit card datasets (`Credit_card.csv` and `Credit_card_label.csv`).
  - Handling missing values by removing irrelevant columns and rows.
  - Checking for and removing duplicate entries.
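
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the in-memory CSV text stands in for `Credit_card.csv` and `Credit_card_label.csv`, and the join key `Ind_ID`, the `Type_Occupation` column, and all row values are assumptions made for the example.

```python
import io

import pandas as pd

# Tiny in-memory stand-ins for Credit_card.csv and Credit_card_label.csv.
# Column names Ind_ID and Type_Occupation, and all values, are hypothetical.
features_csv = io.StringIO(
    "Ind_ID,GENDER,Annual_income,Type_Occupation\n"
    "1,M,180000,\n"
    "2,F,,Manager\n"
    "3,F,90000,\n"
    "3,F,90000,\n"
)
labels_csv = io.StringIO("Ind_ID,label\n1,1\n2,0\n3,0\n")

features = pd.read_csv(features_csv)
labels = pd.read_csv(labels_csv)

# Merge the feature table with the label table on the shared ID column.
merged = features.merge(labels, on="Ind_ID", how="inner")

# Drop a column judged irrelevant (here: mostly empty), then drop rows
# that still contain missing values.
cleaned = merged.drop(columns=["Type_Occupation"]).dropna()

# Remove exact duplicate rows and renumber the index.
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(cleaned)
```

With real data, the two `io.StringIO` objects would be replaced by the paths to the CSV files.
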
- Exploratory Data Analysis (EDA)
  - Calculating descriptive statistics to understand the data.
  - Analyzing the average income by gender and visualizing it with a bar plot.
  - Visualizing the credit approval status distribution using a pie chart.
  - Exploring relationships between features with visualizations such as box plots and violin plots.
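
A small sketch of the first three EDA steps is shown below. It assumes a cleaned DataFrame with `GENDER`, `Annual_income`, and `label` columns; the five rows are invented for illustration, and the plotting layout is one possible choice rather than the notebook's own.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; the real data comes from the merged dataset.
df = pd.DataFrame(
    {
        "GENDER": ["M", "F", "F", "M", "F"],
        "Annual_income": [180000, 90000, 120000, 150000, 60000],
        "label": [1, 0, 0, 0, 1],
    }
)

# Descriptive statistics for the numeric columns.
print(df.describe())

# Average income per gender and the count of each approval label.
income_by_gender = df.groupby("GENDER")["Annual_income"].mean()
status_counts = df["label"].value_counts().sort_index()

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(9, 4))

# Bar plot: average income by gender.
ax_bar.bar(income_by_gender.index, income_by_gender.values)
ax_bar.set_title("Average Income by Gender")
ax_bar.set_ylabel("Annual income")

# Pie chart: share of each label value.
ax_pie.pie(
    status_counts.values,
    labels=status_counts.index.astype(str),
    autopct="%1.0f%%",
)
ax_pie.set_title("Credit Approval Status")

fig.tight_layout()
```

In a notebook the `matplotlib.use("Agg")` line can be dropped so the figure renders inline.
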
- Visualization
  - Average Income by Gender
  - Credit Approval Status
  - Annual Income By Marital Status
  - Birthday Count By Gender
  - Family Members By Marital Status
  - Annual Income By Education
- Feature Engineering and Selection
  - Selecting relevant features for model building (`Car_Owner`, `Propert_Owner`, `Annual_income`, `EDUCATION`, `label`).
  - Converting categorical features to numerical representations using label encoding.
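
Feature selection and label encoding could look something like the sketch below. The column names come from the list above, while the sample values (including the education categories) and the extra `GENDER` column are assumptions. One encoder is kept per column so the integer codes can be mapped back to the original categories later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample rows; GENDER is included only to show it being dropped.
df = pd.DataFrame(
    {
        "GENDER": ["M", "F", "F", "M"],
        "Car_Owner": ["Y", "N", "Y", "N"],
        "Propert_Owner": ["Y", "Y", "N", "N"],
        "Annual_income": [180000, 90000, 120000, 60000],
        "EDUCATION": ["Higher education", "Secondary", "Secondary", "Higher education"],
        "label": [1, 0, 0, 1],
    }
)

# Keep only the features used for modelling, plus the target column.
selected_columns = ["Car_Owner", "Propert_Owner", "Annual_income", "EDUCATION", "label"]
selected = df[selected_columns].copy()

# Label-encode each categorical column with its own encoder.
categorical_columns = ["Car_Owner", "Propert_Owner", "EDUCATION"]
encoders = {}
for column in categorical_columns:
    encoder = LabelEncoder()
    selected[column] = encoder.fit_transform(selected[column])
    encoders[column] = encoder

print(selected)
```

`LabelEncoder` assigns integer codes in sorted order of the category values, so the codes carry no meaningful ranking.
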
- Model Building and Evaluation
  - Splitting the dataset into training and testing sets.
  - Standardizing data using `StandardScaler`.
  - Training and evaluating various classification models:
    - Logistic Regression
    - K-Nearest Neighbors (KNN) with `GridSearchCV` for hyperparameter tuning
    - Support Vector Machine (SVM) with `GridSearchCV`
    - Decision Tree with `GridSearchCV`
    - Random Forest with `GridSearchCV`
    - AdaBoost with `GridSearchCV`
  - Evaluating model performance using accuracy score.
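
The workflow above might be sketched as follows. Synthetic data from `make_classification` stands in for the encoded credit card features, and the parameter grids, split ratio, and `cv=3` are illustrative assumptions, not the values used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the four encoded features and the binary label.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

# Hold out 20% of the rows for testing (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Each entry: (estimator, parameter grid or None). Grids are hypothetical.
candidates = {
    "Logistic Regression": (LogisticRegression(max_iter=1000), None),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "SVM": (SVC(), {"C": [0.1, 1, 10]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5, None]}),
    "Random Forest": (RandomForestClassifier(random_state=0), {"n_estimators": [50, 100]}),
    "AdaBoost": (AdaBoostClassifier(random_state=0), {"n_estimators": [25, 50]}),
}

scores = {}
for name, (estimator, param_grid) in candidates.items():
    # Wrap tunable models in a cross-validated grid search.
    model = estimator if param_grid is None else GridSearchCV(estimator, param_grid, cv=3)
    model.fit(X_train_scaled, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test_scaled))

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Fitting the scaler on the training split alone keeps information from the test rows out of the preprocessing step. The accuracies printed for the synthetic data are unrelated to the results reported for the real dataset.
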
- Results

| Model | Accuracy Score |
|---|---|
| Logistic Regression | 0.90 |
| K-Nearest Neighbors (KNN) | 0.60 |
| Support Vector Classifier (SVC) | 0.80 |
| Decision Tree | 0.80 |
| Random Forest Classifier | 0.85 |
| AdaBoost | 0.90 |

The Logistic Regression and AdaBoost models achieved the highest accuracy of 0.90, making them the best-performing models for credit card approval prediction in this project.
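
As a small illustration of how the comparison can be made programmatically, the sketch below ranks the reported accuracy scores and picks out the top models. The dictionary simply restates the table; the variable names are hypothetical.

```python
# Accuracy scores as reported in the results table.
reported_accuracy = {
    "Logistic Regression": 0.90,
    "K-Nearest Neighbors (KNN)": 0.60,
    "Support Vector Classifier (SVC)": 0.80,
    "Decision Tree": 0.80,
    "Random Forest Classifier": 0.85,
    "AdaBoost": 0.90,
}

# Rank models from highest to lowest accuracy.
ranking = sorted(reported_accuracy.items(), key=lambda item: item[1], reverse=True)

# Collect every model that ties for the best score.
best_score = ranking[0][1]
best_models = [name for name, score in ranking if score == best_score]

for name, score in ranking:
    print(f"{name}: {score:.2f}")
print("Best:", best_models)
```
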
- Clone the repository:
- Activate the virtual environment (if created):
- Run the Jupyter Notebook:
Contributions are welcome! Please open an issue or submit a pull request.
This project is licensed under the MIT License.