Predicting Employee Turnover
Using data to understand why valuable employees leave and building a tool to predict who might leave next.

The Business Challenge
Sailfort Motors, like many companies, faced a common but costly problem: valuable employees were leaving, and it wasn't always clear why. Losing an employee means more than just an empty desk; it costs time and money to recruit, hire, and train a replacement. The goal of this project was to stop guessing and start using data to find answers.
The main question was: Can we use the information we already have about our employees to predict who is at a high risk of leaving? If we can, the HR team can step in with the right support to encourage them to stay, saving the company money and keeping our best talent.
Our Approach: A Four-Step Plan
Understanding the Story in the Data
First, I dove into the employee data. This involved looking at everything from satisfaction scores and performance reviews to salary levels and how many projects someone worked on. The goal was to find patterns and clues that might explain why people leave.
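A minimal sketch of this kind of exploration, using pandas on a handful of invented records. The column names (`number_project`, `satisfaction_level`, `left`) and every value in the table are assumptions made for illustration, not the project's real data.

```python
import pandas as pd

# Hypothetical employee records; all values are invented for this example.
df = pd.DataFrame({
    "number_project": [2, 2, 3, 4, 4, 5, 6, 7],
    "satisfaction_level": [0.40, 0.38, 0.70, 0.80, 0.65, 0.75, 0.12, 0.10],
    "left": [1, 1, 0, 0, 0, 0, 1, 1],  # 1 = employee left, 0 = stayed
})

# Share of employees who left, for each project count.
turnover_by_projects = df.groupby("number_project")["left"].mean()

# Average satisfaction for employees who stayed (0) versus left (1).
satisfaction_by_outcome = df.groupby("left")["satisfaction_level"].mean()

print(turnover_by_projects)
print(satisfaction_by_outcome)
```

Grouped summaries like these are a quick way to spot which attributes separate leavers from stayers before any model is built.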
Preparing the Data for Analysis
Data can be messy. I cleaned and organized all the information, making sure it was consistent and ready for a machine learning model to understand. This is a crucial step to ensure our predictions are accurate.
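As an illustration of what such a cleanup can involve, the sketch below uses pandas on a tiny made-up table. The column names and the specific steps (normalizing column names, dropping incomplete and duplicate rows, encoding salary as an ordered number) are assumptions for the example, not the project's actual pipeline.

```python
import pandas as pd

# Hypothetical raw data with typical problems: untidy column names,
# a missing value, a duplicated row, and inconsistent capitalization.
raw = pd.DataFrame({
    "Satisfaction_Level ": [0.38, 0.80, 0.38, 0.11, None],
    "number_project": [2, 5, 2, 7, 4],
    "Salary": ["low", "medium", "low", "HIGH", "medium"],
    "left": [1, 0, 1, 1, 0],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Normalize column names: strip stray whitespace and lowercase them.
    out.columns = out.columns.str.strip().str.lower()
    # Drop rows with missing values, then exact duplicate rows.
    out = out.dropna().drop_duplicates()
    # Encode the salary band as an ordered integer (assumed ordering).
    salary_order = {"low": 0, "medium": 1, "high": 2}
    out["salary"] = out["salary"].str.lower().map(salary_order)
    return out.reset_index(drop=True)

cleaned = clean(raw)
print(cleaned)
```

Whether duplicate rows should be dropped depends on the dataset: without an employee ID, two identical rows could be a data-entry repeat or two genuinely similar people.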
Building the Predictive Tool
I used a technique called "Gradient Boosting" to build a predictive model. It combines many small decision trees, each one correcting the mistakes of the trees before it. Think of it as a smart system that learns the complex patterns of employees who left in the past. It then uses that knowledge to identify current employees who show similar signs.
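The idea can be sketched with scikit-learn's `GradientBoostingClassifier`. Everything about the data here is an assumption: the two features, the synthetic values, and the toy rule that decides who "left" are invented so the example runs on its own, and the default hyperparameters are not necessarily the ones used in the project.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic, made-up features: satisfaction (0 to 1) and project count (2 to 7).
satisfaction = rng.uniform(0.0, 1.0, n)
projects = rng.integers(2, 8, n)
X = np.column_stack([satisfaction, projects])

# Toy label rule invented for this demo: unhappy employees with a very
# light or very heavy workload are marked as having left (1).
y = ((satisfaction < 0.46) & ((projects <= 2) | (projects >= 6))).astype(int)

# Fit the model on the "historical" records.
model = GradientBoostingClassifier(random_state=0)
model.fit(X, y)

# Estimate the probability of leaving for a hypothetical current employee.
new_employee = np.array([[0.30, 7]])
risk = model.predict_proba(new_employee)[0, 1]
print(f"Estimated risk of leaving: {risk:.2f}")
```

Using `predict_proba` rather than `predict` returns a risk score instead of a yes/no answer, which makes it easier to rank employees by how urgently they might need attention.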
Testing and Confirming the Accuracy
Finally, I tested the model on data it had never seen before to make sure it could make accurate predictions. The results were excellent, with accuracy above 98%, confirming that the tool was reliable and ready to provide valuable insights.
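A held-out test of this kind might look like the sketch below. The synthetic data, the 75/25 split, and the choice to report recall alongside accuracy are assumptions for illustration; the scores it prints describe the toy data only and say nothing about the real model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000

# Synthetic features (satisfaction, project count) and a toy label rule,
# both invented so the example is self-contained.
X = np.column_stack([rng.uniform(0, 1, n), rng.integers(2, 8, n)])
y = ((X[:, 0] < 0.46) & ((X[:, 1] <= 2) | (X[:, 1] >= 6))).astype(int)

# Hold out 25% of the records; the model never sees them during training.
# stratify=y keeps the share of leavers the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)  # share of all predictions that are correct
recall = recall_score(y_test, y_pred)      # share of actual leavers the model caught
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Because employees who leave are usually a minority, accuracy alone can look strong even when many leavers are missed, so recall on the "left" class is a useful companion metric.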
Key Information
- Type: HR Analytics
- Tools: Python
- Accuracy: >98%
Technology Stack
- Python
- Pandas & NumPy
- Scikit-learn
- Matplotlib & Seaborn
- Jupyter Notebook
Key Features
- Business Focus: Identifies at-risk employees, allowing for proactive retention efforts.
- Business Focus: Provides data-backed reasons for churn, helping improve company culture.
- Technical: Utilizes an advanced Gradient Boosting model for high accuracy.
- Technical: Includes comprehensive feature importance analysis to explain the "why" behind predictions.
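A feature importance analysis can be sketched with the `feature_importances_` attribute that scikit-learn exposes on a fitted `GradientBoostingClassifier`. The feature names, the synthetic data, and the toy label (which depends only on satisfaction) are assumptions chosen so the expected ranking is obvious.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
n = 600

# Hypothetical features; "noise" is deliberately unrelated to the outcome.
feature_names = ["satisfaction_level", "number_project", "noise"]
X = np.column_stack([
    rng.uniform(0, 1, n),    # satisfaction_level
    rng.integers(2, 8, n),   # number_project
    rng.normal(size=n),      # noise
])

# Toy label invented for the demo: it depends only on satisfaction.
y = (X[:, 0] < 0.46).astype(int)

model = GradientBoostingClassifier(random_state=1).fit(X, y)

# Pair each feature with its impurity-based importance, highest first.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:20s} {score:.3f}")
```

Impurity-based importances show which features the trees relied on, not the direction of the effect. scikit-learn's `permutation_importance` in `sklearn.inspection` is a common cross-check.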
Mission Accomplished: From Data to Decisions
The final predictive model achieved an accuracy of over 98%. With predictions this accurate, HR can move from reactive problem-solving to proactive talent retention, focusing their efforts where they matter most.
Low Satisfaction is a Major Red Flag
Employees with satisfaction scores below 0.46 were prime candidates for churn.
Workload Balance is Key
Both overworked (6-7 projects) and underutilized (2 projects) staff were high-risk.
Recognition Matters
High-performers with unexpectedly low evaluations were likely to leave.
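The satisfaction and workload findings above can also be written as simple screening rules. This is a hypothetical sketch for illustration, not the trained model: the `Employee` class, the `risk_flags` helper, and the flag labels are invented here, and only the thresholds (satisfaction below 0.46, 2 projects, 6-7 projects) come from the findings.

```python
from dataclasses import dataclass
from typing import List

LOW_SATISFACTION = 0.46            # threshold quoted in the findings
UNDERUTILIZED_PROJECTS = 2         # "underutilized" project count
OVERWORKED_PROJECTS = range(6, 8)  # "overworked": 6 or 7 projects

@dataclass
class Employee:
    name: str
    satisfaction: float  # score between 0 and 1
    projects: int        # number of projects assigned

def risk_flags(employee: Employee) -> List[str]:
    """Return the rule-based warning signs that apply to one employee."""
    flags = []
    if employee.satisfaction < LOW_SATISFACTION:
        flags.append("low satisfaction")
    if employee.projects == UNDERUTILIZED_PROJECTS:
        flags.append("underutilized")
    elif employee.projects in OVERWORKED_PROJECTS:
        flags.append("overworked")
    return flags

# Hypothetical employees for a quick check.
for person in [Employee("A", 0.30, 7), Employee("B", 0.80, 4)]:
    print(person.name, risk_flags(person))
```

Rules like these are easy to explain to managers, but they treat each signal separately, whereas the model can weigh combinations of signals.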