
How to measure the variation resulting from different random seeds in machine learning?

Data Science, asked by user100740 on September 5, 2021

I'm running an xgboost model to predict probabilities for a binary classification problem. I then aggregate the results by the Age variable (the aggregated risk of getting the sickness at age x). I made the mistake of not fixing the random seed, so when I rerun the model I get slightly different aggregated results. Could you point me to a reference showing that this variation is not substantial, so that I can avoid rerunning the model with several seeds to build a confidence interval? I would rather not do that because training takes a few hours. Thank you in advance!

One Answer

Unfortunately, it actually can be substantial. This nice article goes into depth about it, and this question shows a clear impact as well.

So depending on the model's performance, the algorithm used, and especially the distribution of your data set, you can expect the random seed to influence your results almost as much as optimizing any other hyperparameter (roughly 2-4 percentage points in both examples).

You can minimize this by reducing imbalance in your data sets (e.g. resampling the train and test data has a similar effect), or by simply fixing your random seed and never touching it; a sketch of pinning the seeds is shown below.
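A minimal sketch of what "fixing the seed" means in practice, assuming the scikit-learn-style XGBClassifier API; the data below is synthetic and only stands in for your own features and labels:

    # Minimal sketch: pin every source of randomness. Assumes the scikit-learn
    # style XGBClassifier API; the data here is synthetic and purely illustrative.
    import numpy as np
    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    SEED = 42
    np.random.seed(SEED)  # covers any numpy-based preprocessing

    # stand-in for your real feature matrix and binary labels
    X = np.random.rand(1000, 10)
    y = np.random.randint(0, 2, size=1000)

    # the train/test split is itself a source of run-to-run variation
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=SEED)

    model = xgb.XGBClassifier(
        n_estimators=200,
        subsample=0.8,          # row subsampling is one source of randomness
        colsample_bytree=0.8,   # column subsampling is another
        random_state=SEED,      # pins xgboost's own RNG
    )
    model.fit(X_tr, y_tr)
    probs = model.predict_proba(X_te)[:, 1]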

Now that you are where you are, there is not a lot you can do. Maybe pointing to the two sources above will let you get away with a fixed "interval", or maybe you simply use your last result. If you can afford a handful of reruns, the sketch after this paragraph shows one way to estimate the seed-induced spread of your aggregated risk.
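A rough sketch of that estimate, under stated assumptions: the data below is synthetic (risk rising with Age, which sits in column 0), and you would substitute your own pipeline. It retrains with a handful of seeds and reports the per-Age spread of the aggregated risk:

    # Rough sketch: retrain with a handful of seeds and measure the spread of
    # the Age-aggregated risk across runs. The data is synthetic; column 0 plays
    # the role of the Age variable from the question.
    import numpy as np
    import pandas as pd
    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 2000
    age = rng.integers(20, 80, size=n)
    X = np.column_stack([age, rng.normal(size=(n, 5))])
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(age - 50) / 15.0))).astype(int)  # risk rises with age

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    age_te = X_te[:, 0].astype(int)

    runs = []
    for seed in range(5):  # a handful of seeds; each retrain is expensive on real data
        model = xgb.XGBClassifier(
            n_estimators=100, subsample=0.8, colsample_bytree=0.8, random_state=seed
        )
        model.fit(X_tr, y_tr)
        probs = model.predict_proba(X_te)[:, 1]
        agg = pd.Series(probs, index=age_te).groupby(level=0).mean()  # aggregated risk per Age
        runs.append(agg.rename(f"seed_{seed}"))

    per_age = pd.concat(runs, axis=1)
    summary = pd.DataFrame({
        "mean": per_age.mean(axis=1),
        "std": per_age.std(axis=1),   # seed-induced spread of the aggregated risk
        "min": per_age.min(axis=1),
        "max": per_age.max(axis=1),
    })
    print(summary.round(3).head(10))

Since each retrain takes a few hours in your case, even two or three seeds give a usable first impression of the spread.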

Answered by Fnguyen on September 5, 2021
