Data Science: Asked by J.GUO on March 18, 2021
Context:
To predict employee turnover (will an employee leave?), I trained a classification algorithm (LDA) on my dataset and then made predictions.
The dataset is quite small: about 500 rows and 20 features, including Years_Spent, Salary_Increase, and Sale_Bonus.
However, HR experience tells us that:

- For employees with Years_Spent < 1.5, Salary_Increase has no impact on turnover (because Salary_Increase > 0 only when Years_Spent > 1.5).
- Sale_Bonus has no impact on employees who are not in sales (because IT staff never receive a sales bonus).
Here comes the problem: if I set Salary_Increase = 0 for employees with Years_Spent < 1.5, and Sale_Bonus = 0 for those who are not in sales, the classification algorithm will take 0 as a very small value. A possible conclusion drawn by the algorithm could then be: "employee A will leave because he never receives a sale bonus." In reality, however, employee A is from the IT department, never receives a sales bonus, and will not leave because of that. As we see, the constructed model is not correct.
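To make this concrete, here is a minimal sketch of that zero-fill encoding in pandas (the column names are hypothetical, inferred from the HR rules above; the real dataset has about 20 features):

```python
import pandas as pd

# Toy rows illustrating the encoding described above; column names are
# hypothetical, taken from the HR rules in the question.
df = pd.DataFrame({
    "Years_Spent":     [0.8, 3.2, 5.1, 4.0],
    "Salary_Increase": [None, 0.05, 0.10, 0.08],
    "Department":      ["IT", "Sales", "Sales", "IT"],
    "Sale_Bonus":      [None, 2000.0, 3500.0, None],
})

# Zero-fill: after this step the classifier sees 0 as "a very small
# raise/bonus", not as "this feature does not apply to this employee".
df.loc[df["Years_Spent"] < 1.5, "Salary_Increase"] = 0.0
df.loc[df["Department"] != "Sales", "Sale_Bonus"] = 0.0
```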
My question is: how can I handle this kind of problem so that HR experience can be understood by classification algorithms?
Thank you for your patient reading; all sorts of discussion are welcome!
Welcome to the site!
What you are describing above is known as an interaction between features.
You should consider the algorithm you wish to use and whether it allows for interactions between predictors. Some techniques - like generalised linear models, or the LDA you are using - require interactions to be stated explicitly as constructed features, while tree-based algorithms will capture interactions automatically.
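For example, since scikit-learn's LinearDiscriminantAnalysis has no formula interface, the HR rules can be encoded as explicit indicator and interaction features before fitting. A minimal, self-contained sketch with hypothetical column names and an assumed binary target column Left (toy data, not from the question):

```python
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy data with hypothetical column names from the question;
# "Left" is an assumed binary turnover target (1 = employee left).
df = pd.DataFrame({
    "Years_Spent":     [0.8, 3.2, 5.1, 4.0, 0.9, 0.6],
    "Salary_Increase": [0.0, 0.05, 0.10, 0.08, 0.0, 0.0],
    "Department":      ["IT", "Sales", "Sales", "IT", "Sales", "IT"],
    "Sale_Bonus":      [0.0, 2000.0, 3500.0, 0.0, 800.0, 0.0],
    "Left":            [0, 0, 1, 0, 1, 1],
})

# Indicator columns state whether each feature applies at all.
df["Is_Sales"] = (df["Department"] == "Sales").astype(int)
df["Raise_Eligible"] = (df["Years_Spent"] >= 1.5).astype(int)

# Explicit interaction terms: the raw value only carries signal
# where the indicator marks the feature as applicable.
df["Raise_x_Eligible"] = df["Salary_Increase"] * df["Raise_Eligible"]
df["Bonus_x_Sales"] = df["Sale_Bonus"] * df["Is_Sales"]

X = df[["Years_Spent", "Is_Sales", "Raise_Eligible",
        "Raise_x_Eligible", "Bonus_x_Sales"]]
y = df["Left"]

model = LinearDiscriminantAnalysis().fit(X, y)
print(model.predict(X))
```

With the indicator columns in the model, a zero in an interaction column no longer reads as "a very small bonus": the Is_Sales and Raise_Eligible terms absorb the "not applicable" case.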
Answered by bradS on March 18, 2021