
KL Divergence Between Predictions and Ground Truth

Data Science · Asked by TheCuriouslyCodingFoxah on July 23, 2021

I’ve got four (non-linear, tree-based) models in production and am serving the average of their outputs as the prediction. We get ground-truth data immediately.

During training, the optimized candidate models had very similar performance, so I decided to deploy all of them and serve the average as the prediction, with the intention of figuring out which one is really best at a later point.

That later point has come.

Out of these four models, two seem to fit the ground-truth distribution quite well, at least judging by the KDE plot of the predicted values against the ground-truth distribution.
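For illustration, this is roughly how I produced that comparison plot (the data and model names here are simulated stand-ins, not my actual production values):

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the served predictions and ground truth.
ground_truth = rng.normal(0.0, 1.0, size=50_000)
model_preds = {
    "model_a": rng.normal(0.0, 1.0, size=50_000),
    "model_b": rng.normal(0.3, 1.2, size=50_000),
}

# Overlay each model's predicted-value density on the ground-truth density.
ax = sns.kdeplot(ground_truth, label="ground truth", fill=True)
for name, preds in model_preds.items():
    sns.kdeplot(preds, label=name, ax=ax)
ax.legend()
plt.show()
```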

I was initially thinking of running a pairwise two-sample Kolmogorov–Smirnov (KS) test (each model against the ground truth) to check whether the predicted values from each model come from the same distribution as the ground truth.

But I talked myself out of it, mostly because I have 50,000+ predictions and I figured that a sample size that large will drive the p-value down even for trivial differences.
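To show what I mean, here is a minimal sketch with simulated data (the shift of 0.05 is an arbitrary, practically negligible difference I picked for illustration):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
n = 50_000

# Simulated ground truth and a prediction with a practically negligible shift.
ground_truth = rng.normal(0.00, 1.0, size=n)
predictions = rng.normal(0.05, 1.0, size=n)

stat, p_value = ks_2samp(predictions, ground_truth)
# At n = 50,000, even this tiny shift typically comes out "significant".
print(f"KS statistic = {stat:.4f}, p-value = {p_value:.2e}")
```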

I then turned my attention to KL divergence, something I have never used before.
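From what I’ve read, the discrete form is defined between two probability distributions P and Q rather than between raw values:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i p_i \log \frac{p_i}{q_i}$$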

Would comparing the KL divergence between each model’s predictions and the ground truth be a good way to assess model fit? Can I compare the raw values directly, or do they need to be converted to probability distributions first?

If so, how would I go about doing this? scipy.special.kl_div outputs an array which contains (I assume) the elementwise divergence between the prediction and the ground truth.

Would I just take the sum of that array and call the result the divergence between the model and the ground truth?
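To make the question concrete, this is what I was picturing (a sketch only; the 100 bins and the epsilon smoothing are guesses on my part, not something I know to be standard practice):

```python
import numpy as np
from scipy.special import kl_div

def kl_from_samples(pred, truth, bins=100, eps=1e-10):
    """Approximate D_KL(truth || pred) from raw samples via shared-bin histograms."""
    # Shared bin edges so both histograms are over the same support.
    edges = np.histogram_bin_edges(np.concatenate([pred, truth]), bins=bins)
    p_counts, _ = np.histogram(truth, bins=edges)
    q_counts, _ = np.histogram(pred, bins=edges)
    # Turn counts into probabilities; eps keeps empty bins from blowing up log(p/q).
    p = (p_counts + eps) / (p_counts + eps).sum()
    q = (q_counts + eps) / (q_counts + eps).sum()
    # scipy.special.kl_div is elementwise: p*log(p/q) - p + q. Summing it gives
    # the KL divergence, since the -p + q terms cancel (both vectors sum to 1).
    return kl_div(p, q).sum()

# Hypothetical usage, one value per model (lower = closer to ground truth):
# for name, preds in model_preds.items():
#     print(name, kl_from_samples(preds, ground_truth))
```

The idea would be to compute this once per model and treat the model with the smallest value as the closest fit to the ground truth.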

Hopefully I don’t sound crazy or too much like I don’t know what I’m doing. Because I kinda don’t, but not too much.

Thanks
