The aim is to build a predictive model that accurately classifies whether an employee is likely to leave the company or stay. This allows companies to take proactive measures, such as improving working conditions, offering promotions, or addressing dissatisfaction, to retain valuable employees.
The model makes this prediction from a set of employee-related factors. Because every record is labeled with whether the employee left, this is a supervised classification problem.
The dataset contains information about employees leaving the company. For the analysis, I have HR records for 14,999 employees, each described by 9 independent features and 1 output feature (the target variable):
- Satisfaction level: Employee satisfaction level (ranges from 0 to 1).
- Last evaluation: Last evaluation score of the employee (ranges from 0 to 1).
- Number of project: Number of projects the employee has worked on.
- Average monthly hours: Average monthly hours the employee has worked.
- Time spend company: Number of years the employee has spent in the company.
- Work accident: Whether the employee has had a work accident (0 for No, 1 for Yes).
- Left: Whether the employee has left the company (0 for No, 1 for Yes). This is the target variable we aim to predict using the other features.
- Promotion last 5 years: Whether the employee was promoted in the last 5 years (0 for No, 1 for Yes).
- Departments: Department in which the employee works.
- Salary: Salary level of the employee (low, medium, high).
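The features above mix numeric, binary, and categorical columns, so the categorical ones need encoding before a classifier can use them. The following is a minimal sketch of one way to do that with pandas. The column names, the sample rows, and the `prepare_features` helper are assumptions made for illustration and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical rows with assumed column names; the values are invented for illustration.
rows = [
    {"satisfaction_level": 0.38, "last_evaluation": 0.53, "number_project": 2,
     "average_monthly_hours": 157, "time_spend_company": 3, "work_accident": 0,
     "left": 1, "promotion_last_5years": 0, "department": "sales", "salary": "low"},
    {"satisfaction_level": 0.80, "last_evaluation": 0.86, "number_project": 5,
     "average_monthly_hours": 262, "time_spend_company": 6, "work_accident": 0,
     "left": 0, "promotion_last_5years": 0, "department": "technical", "salary": "medium"},
    {"satisfaction_level": 0.72, "last_evaluation": 0.87, "number_project": 4,
     "average_monthly_hours": 223, "time_spend_company": 5, "work_accident": 1,
     "left": 0, "promotion_last_5years": 1, "department": "hr", "salary": "high"},
]
df = pd.DataFrame(rows)

# Salary has a natural order, so map it to integers instead of one-hot encoding it.
SALARY_ORDER = {"low": 0, "medium": 1, "high": 2}


def prepare_features(frame: pd.DataFrame) -> tuple[pd.DataFrame, pd.Series]:
    """Split the target from the features and encode the categorical columns."""
    y = frame["left"]
    X = frame.drop(columns=["left"]).copy()
    X["salary"] = X["salary"].map(SALARY_ORDER)
    # Department has no natural order, so expand it into one indicator column per value.
    X = pd.get_dummies(X, columns=["department"], dtype=int)
    return X, y


X, y = prepare_features(df)
print(X.columns.tolist())
```

Treating salary as ordinal and department as one-hot is one reasonable choice; tree-based models are fairly insensitive to it, while logistic regression may also benefit from scaling the numeric columns.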
The following classification algorithms are considered:
- Logistic Regression
- Decision Tree
- Random Forest
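The three classifiers listed above can be trained and compared with a few lines of scikit-learn. This is only a sketch: it uses synthetic data from `make_classification` as a stand-in for the HR records, and the split ratio, hyperparameters, and accuracy metric are assumptions rather than the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the HR data: 9 features and a binary target (assumption).
X, y = make_classification(
    n_samples=500, n_features=9, n_informative=5, random_state=0
)

# Hold out 20% of the rows for evaluation, keeping the class ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

If far fewer employees leave than stay, accuracy alone can be misleading, so precision, recall, or F1 on the "left" class may be more informative when comparing the models.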