Data Science Asked on May 1, 2021
I wanted a metric where I could weigh each class as I wish while measuring "total accuracy". sklearn seems to have this with balanced_accuracy_score. Irrespective of the sample_weight, I am getting the same "balanced accuracy". Why? What was the point of sample_weight?
from sklearn.metrics import balanced_accuracy_score
import numpy as np

# y: true labels, m: a fitted model, xs: the features (all defined earlier)
sample_weight = np.array([1 if i == 0 else 1000 for i in y])
balanced_accuracy_score(y, m.predict(xs), sample_weight=sample_weight)
Here are the docs.
The point of sample_weight is to give weights to specific samples (e.g. by their importance or certainty), not to specific classes.
Apparently, the "balanced accuracy" is (from the user guide):
the macro-average of recall scores per class
So, since the score is averaged across classes, only the weights within each class matter, not the weights between classes... and your weights are the same within each class and change only across classes.
Explicitly (from the user guide again):
$$\hat{w}_i = \frac{w_i}{\sum_j{1(y_j = y_i) w_j}}$$
i.e. the i-th sample is re-weighted by dividing its weight by the total weight of all samples with the same label.
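Here is a minimal sketch of that normalization (the toy labels and variable names are mine, not from the question): with weights that are constant within each class, every sample of a class ends up with the same normalized weight, so the macro-averaged recall cannot change.
import numpy as np
from sklearn.metrics import balanced_accuracy_score

# Toy data: class 0 and class 1, each with one misclassified sample
y_true = np.array([0, 1, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 1])

# Class-constant weights: every class-0 sample gets 1000, every class-1 sample gets 1
w = np.array([1000.0 if y == 0 else 1.0 for y in y_true])

# The user-guide normalization: divide each weight by the total weight of its class
w_hat = np.array([wi / w[y_true == yi].sum() for wi, yi in zip(w, y_true)])
print(w_hat)  # every sample ends up with the same normalized weight (1/3 each)

# Hence class-constant weights cannot move the balanced accuracy
print(balanced_accuracy_score(y_true, y_pred))                   # 0.666...
print(balanced_accuracy_score(y_true, y_pred, sample_weight=w))  # 0.666... as well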
Now, if you want, you can just use the simple accuracy score, and plug in weights as you see fit.
In the following example:
from sklearn.metrics import balanced_accuracy_score, accuracy_score
y_true = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1]
some_sample_weights = [10, 1, 1, 1, 10, 1, 0.5, 0.5, 0.5, 0.5]
weights_by_class = [1 if y == 1 else 1000 for y in y_true]
print('with some weights: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred, sample_weight=some_sample_weights)))
print('without weights: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred)))
print('with class weights in balanced accuracy score: {:.2f}'.format(balanced_accuracy_score(y_true, y_pred, sample_weight=weights_by_class)))
print('with class weights in accuracy score: {:.5f}'.format(accuracy_score(y_true, y_pred, sample_weight=weights_by_class)))
class_sizes = [sum(1 for y in y_true if y == x) / len(y_true) for x in (0, 1)]
weights_by_class_manually_balanced = [w/class_sizes[y] for w, y in zip(weights_by_class, y_true)]
print('with class weights in accuracy score (manually balanced): {:.5f}'.format(accuracy_score(y_true, y_pred, sample_weight=weights_by_class_manually_balanced)))
you get:
with some weights: 0.58
without weights: 0.79
with class weights in balanced accuracy score: 0.79
with class weights in accuracy score: 0.75012
with class weights in accuracy score (manually balanced): 0.75008
As you can see: class-constant weights do not change the balanced accuracy score at all. Plugged into the plain accuracy score, they pull the result towards the recall of class 0, about 0.75, because the score is now dominated by how well the 0 labels are correctly classified. Re-adjusting the weights according to class sizes doesn't matter much; the accuracy is a tiny bit lower, since dividing by the class size gives the smaller 0 class (the one with the lower recall) slightly more relative weight.
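And if what you were after in the first place is a macro-style score where you choose each class's weight yourself, one way (a sketch of mine, not a built-in option of balanced_accuracy_score) is to take the per-class recalls and average them with whatever class weights you like, reusing the example above:
import numpy as np
from sklearn.metrics import recall_score

y_true = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 1]

# Recall of each class separately: array([0.75, 0.8333...])
per_class_recall = recall_score(y_true, y_pred, average=None)

# Average with your own class weights (here class 0 counts 1000 times more than class 1)
class_weights = np.array([1000.0, 1.0])
print(np.average(per_class_recall, weights=class_weights))  # ~0.75, dominated by class 0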
Answered by Itamar Mushkin on May 1, 2021