AUC on ROC Curve near 1.0 for Multi-Class CNN but Precision/Recall are not perfect?

Data Science question, asked by Coldchain9 on October 13, 2020

I am building a ROC curve and calculating AUC for multi-class classification on the CIFAR-10 dataset using a CNN. My overall accuracy is ~90%, and my precision and recall are as follows:

              precision    recall  f1-score   support

    airplane       0.93      0.90      0.91      1000
  automobile       0.93      0.96      0.95      1000
        bird       0.88      0.87      0.87      1000
         cat       0.86      0.72      0.79      1000
        deer       0.88      0.91      0.89      1000
         dog       0.88      0.81      0.84      1000
        frog       0.83      0.97      0.89      1000
       horse       0.94      0.94      0.94      1000
        ship       0.95      0.93      0.94      1000
       truck       0.90      0.95      0.92      1000

    accuracy                           0.90     10000
   macro avg       0.90      0.90      0.90     10000
weighted avg       0.90      0.90      0.90     10000

[Figure: one-vs-rest ROC curve for each of the ten classes, each with AUC near 1.0]

The code where I calculate the ROC Curve and AUC is below:

from itertools import cycle
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize
from tensorflow.keras.models import Model, load_model

def assess_model_from_pb(model_file_path: Path, xtest: np.ndarray, ytest: np.ndarray, save_plot_path: Path):

    class_labels = ['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck']
    model = load_model(model_file_path) # load model from filepath
    feature_extractor = Model(inputs = model.inputs, outputs = model.get_layer('dense').output) # extract dense output layer (will be softmax probabilities)
    y_score = feature_extractor.predict(xtest, batch_size = 64) # softmax probabilities, shape (n_samples, n_classes)
    ytest_binary = label_binarize(ytest, classes = [0,1,2,3,4,5,6,7,8,9]) # one hot encode the test data true labels
    n_classes = y_score.shape[1] # number of classes is the second axis of the 2-D score matrix

    fpr = dict()
    tpr = dict()
    roc_auc = dict() 
    # compute fpr and tpr with roc_curve from the ytest true labels to the scores
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(ytest_binary[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    # plot each class's curve on a single graph for multi-class one-vs-all classification
    colors = cycle(['blue', 'red', 'green', 'brown', 'purple', 'pink', 'orange', 'black', 'yellow', 'cyan'])
    for i, color, lbl in zip(range(n_classes), colors, class_labels):
        plt.plot(fpr[i], tpr[i], color = color, lw = 1.5,
                 label = 'ROC Curve of class {0} (area = {1:0.3f})'.format(lbl, roc_auc[i]))
    plt.plot([0, 1], [0, 1], 'k--', lw = 1.5)
    plt.xlim([-0.05, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve for CIFAR-10 Multi-Class Data')
    plt.legend(loc = 'lower right', prop = {'size': 6})
    fullpath = save_plot_path.joinpath(save_plot_path.stem +'_roc_curve.png')
    plt.savefig(fullpath)
    plt.show()
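
For reference, a hypothetical invocation would look like the snippet below; the model path, output directory, and the divide-by-255 preprocessing are placeholders/assumptions, just to show the expected argument types:

from pathlib import Path
from tensorflow.keras.datasets import cifar10

# Hypothetical usage: paths are placeholders and the /255.0 scaling assumes
# the same preprocessing that was applied at training time.
(_, _), (xtest, ytest) = cifar10.load_data()
assess_model_from_pb(Path('saved_model'), xtest / 255.0, ytest, Path('plots'))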

I suppose I am just confused about how my AUC can be near 1 when my precision and recall are not perfect. I understand that many thresholds are swept to decide what counts as a positive versus a negative prediction. For example, towards the beginning of the curve, where the threshold is very high (say around 0.99999), how is it that my TPR is already near 1? Is it solely because at such a threshold only the examples with the very highest softmax probabilities are classified as positive?
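
To check my intuition, here is a tiny one-vs-rest sketch with made-up numbers (not from my model). The "cat" probabilities of the true cats are only moderate, so another class could easily win the argmax and hurt cat recall, yet every cat still outranks every non-cat on that single score, so the one-vs-rest AUC is still 1.0:

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Made-up one-vs-rest example for a single class ("cat"):
# 1 = the image truly is a cat, 0 = any of the other nine classes.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
# Softmax probability assigned to "cat" for each image. The true cats only get
# ~0.4-0.6, so another class could take the argmax (hurting cat recall), but
# every cat still scores higher than every non-cat on this one column.
cat_score = np.array([0.55, 0.48, 0.42, 0.60, 0.05, 0.10, 0.02, 0.30, 0.08, 0.01])

print(roc_auc_score(y_true, cat_score))  # 1.0: the one-vs-rest ranking is perfect
fpr, tpr, thresholds = roc_curve(y_true, cat_score)
for t, f, r in zip(thresholds, fpr, tpr):
    print(f'threshold={t:.2f}  FPR={f:.2f}  TPR={r:.2f}')
# TPR hits 1.0 while FPR is still 0.0, because by the time the threshold drops
# to ~0.42 every cat is flagged positive and no non-cat has crossed it yet.

If that picture is right, the near-1 AUCs only say that each class's score ranks its true positives above its negatives almost all of the time; they say nothing about whether that class also beats the other nine classes at argmax time, which is what the precision/recall table measures.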

I would just like a bit more explanation or intuition on the topic to make sure I am not doing something incorrectly.
