TransWikia.com

SHAP value analysis gives different feature importance on train and test set

Data Science Asked by pbk on August 20, 2021

Should SHAP value analysis be done on the train or test set?

What does it mean if the feature importance based on mean |SHAP value| differs between the train and test set of my LightGBM model?

I intend to use SHAP analysis to identify how each feature contributes to each individual prediction and possibly to identify individual predictions that are anomalous. For instance, if an individual prediction's top (+/-) contributing features are vastly different from the model's overall feature importance, then this prediction is less trustworthy. Does this approach make sense?

2 Answers

SHAP is a local explainer: each set of SHAP values explains the model's prediction for one individual sample, so your explanations are local (specific to a given instance).

You are just comparing explanations of different instances and getting different results. This is normal and can happen within the train set or the test set as well as between them. It also doesn't mean that your train/test split is bad; the split could be perfectly fine.

In the end, SHAP is meant to help you understand how the model behaves on a particular instance, so it should be computed on whichever data you are interested in understanding. I guess you can also try to use SHAP values to find what differs between train and test, but since they are local explanations you might not have much success.
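If you do want to put the two sets side by side, here is a minimal sketch of the mean |SHAP value| comparison from the question. It assumes you already have one SHAP matrix per set (rows are samples, columns are features); with the shap library these would typically come from `shap.TreeExplainer(model).shap_values(X)`. The feature names and numbers below are made up for illustration, and the helper functions are hypothetical, not part of any library.

```python
def mean_abs_shap(shap_matrix):
    """Mean |SHAP value| per feature; rows are samples, columns are features."""
    n_rows = len(shap_matrix)
    n_features = len(shap_matrix[0])
    return [
        sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j in range(n_features)
    ]


def importance_ranking(importances, feature_names):
    """Feature names sorted from most to least important."""
    order = sorted(
        range(len(importances)), key=lambda j: importances[j], reverse=True
    )
    return [feature_names[j] for j in order]


# Hypothetical SHAP matrices (assumption: made-up values, not real model output).
# With the shap library they would typically be computed separately on
# X_train and X_test using the same fitted model.
feature_names = ["age", "income", "tenure"]
shap_train = [[0.5, -0.25, 0.0], [-0.5, 0.25, 0.25], [0.5, 0.25, 0.0]]
shap_test = [[0.25, -0.5, 0.0], [-0.25, 0.5, 0.25], [0.25, -0.5, 0.0]]

train_importance = mean_abs_shap(shap_train)
test_importance = mean_abs_shap(shap_test)
train_rank = importance_ranking(train_importance, feature_names)
test_rank = importance_ranking(test_importance, feature_names)

print("train ranking:", train_rank)
print("test ranking: ", test_rank)
```

In this toy data the two top features swap places between train and test. Because the importance is just an average of local explanations, such a swap only tells you that the samples in the two sets are explained differently on average, not which set is "right".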

I wouldn't draw any conclusion about the quality of predictions from the feature importance.

Correct answer by Carlos Mougan on August 20, 2021

You have to make sure that the problem doesn't come from your data or your model:

  • Make sure that your data don't change significantly between train and test: same class proportions, but also similar feature distributions, correlations between features, and correlations between features and the output.

  • Make sure that your model does not overfit your train data.
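The two checks above can be sketched roughly as follows. This is only an illustration with made-up labels and predictions: it compares class proportions between the two sets and looks at the gap between train and test accuracy. The helper names are hypothetical, accuracy is just one possible metric, and what counts as a "significant" gap is a judgment call that depends on your problem.

```python
from collections import Counter


def class_proportions(labels):
    """Share of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def max_proportion_gap(train_labels, test_labels):
    """Largest absolute difference in class share between train and test."""
    p_train = class_proportions(train_labels)
    p_test = class_proportions(test_labels)
    classes = set(p_train) | set(p_test)
    return max(abs(p_train.get(c, 0.0) - p_test.get(c, 0.0)) for c in classes)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions (assumption: illustrative values only).
y_train = [0, 0, 0, 1]
y_test = [0, 0, 1, 1]
pred_train = [0, 0, 0, 1]
pred_test = [0, 1, 1, 0]

proportion_gap = max_proportion_gap(y_train, y_test)
overfit_gap = accuracy(y_train, pred_train) - accuracy(y_test, pred_test)

print("largest class-proportion gap:", proportion_gap)
print("train minus test accuracy:   ", overfit_gap)
```

A large class-proportion gap points at the split or the data, while a large accuracy gap points at overfitting. Either one can be enough to make the train and test SHAP importances disagree.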

Once you have made sure of that, the idea of using SHAP to look for outliers is interesting, but it might not work at all, depending on your variables and your problem.
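One possible way to try the outlier idea from the question is to measure, for each prediction, how many of its top-k contributing features are also among the global top-k. This is a sketch under assumptions: the overlap measure, the value of k, and the 0.5 threshold are all arbitrary choices made for illustration, and the SHAP values are made up.

```python
def top_k_features(values, k):
    """Indices of the k features with the largest absolute value."""
    order = sorted(range(len(values)), key=lambda j: abs(values[j]), reverse=True)
    return set(order[:k])


def overlap_with_global(shap_row, global_importance, k):
    """Fraction of the instance's top-k features that are also in the global top-k."""
    local_top = top_k_features(shap_row, k)
    global_top = top_k_features(global_importance, k)
    return len(local_top & global_top) / k


# Hypothetical global importance (mean |SHAP| per feature) and per-instance
# SHAP values (assumption: illustrative numbers, not real model output).
global_importance = [0.5, 0.25, 0.1, 0.05]
shap_rows = [
    [0.4, -0.3, 0.0, 0.1],   # driven by the globally important features
    [0.0, 0.05, -0.6, 0.5],  # driven by globally unimportant features
]

k = 2
threshold = 0.5  # arbitrary cut-off, chosen only for this example
overlaps = [overlap_with_global(row, global_importance, k) for row in shap_rows]
flagged = [i for i, score in enumerate(overlaps) if score < threshold]

print("overlap per instance:", overlaps)
print("flagged instances:   ", flagged)
```

Keep in mind that a flagged instance is only unusual in how the model explains it. Whether that makes the prediction less trustworthy is exactly the part that may or may not hold for your data.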

Answered by lcrmorin on August 20, 2021
