Data Science Asked by Coldchain9 on October 13, 2020
I am building an ROC curve and calculating AUC for multi-class classification on the CIFAR-10 dataset using a CNN. My overall accuracy is ~90%, and my per-class precision and recall are as follows:
              precision    recall  f1-score   support

    airplane       0.93      0.90      0.91      1000
  automobile       0.93      0.96      0.95      1000
        bird       0.88      0.87      0.87      1000
         cat       0.86      0.72      0.79      1000
        deer       0.88      0.91      0.89      1000
         dog       0.88      0.81      0.84      1000
        frog       0.83      0.97      0.89      1000
       horse       0.94      0.94      0.94      1000
        ship       0.95      0.93      0.94      1000
       truck       0.90      0.95      0.92      1000

    accuracy                           0.90     10000
   macro avg       0.90      0.90      0.90     10000
weighted avg       0.90      0.90      0.90     10000
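(For reference, the report above is in the format of scikit-learn's classification_report; something like the following produces it. Here y_score and ytest are the same arrays used in the function below, and taking the argmax of the softmax scores as the predicted class is my assumption about the setup.)

# Roughly how the report above is produced; y_score holds the softmax scores and
# ytest the integer test labels (same data as in the function below).
import numpy as np
from sklearn.metrics import classification_report

class_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                'dog', 'frog', 'horse', 'ship', 'truck']
y_pred = np.argmax(y_score, axis=1)   # predicted class = highest softmax probability
print(classification_report(ytest, y_pred, target_names=class_labels))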
The code where I calculate the ROC curve and AUC is below:
from pathlib import Path
from itertools import cycle

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize
from tensorflow.keras.models import Model, load_model


def assess_model_from_pb(model_file_path: Path, xtest: np.ndarray, ytest: np.ndarray, save_plot_path: Path):
    class_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
    model = load_model(model_file_path)  # load model from filepath
    # extract the output of the final dense layer (these are the softmax probabilities)
    feature_extractor = Model(inputs=model.inputs, outputs=model.get_layer('dense').output)
    y_score = feature_extractor.predict(xtest, batch_size=64)  # softmax scores, shape (n_samples, n_classes)
    ytest_binary = label_binarize(ytest, classes=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])  # one-hot encode the true test labels
    n_classes = y_score.shape[1]

    fpr = dict()
    tpr = dict()
    roc_auc = dict()
    # compute fpr and tpr with roc_curve from the true labels against each class's scores
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(ytest_binary[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    # plot each class curve on a single graph for multi-class one-vs-rest classification
    colors = cycle(['blue', 'red', 'green', 'brown', 'purple', 'pink', 'orange', 'black', 'yellow', 'cyan'])
    for i, color, lbl in zip(range(n_classes), colors, class_labels):
        plt.plot(fpr[i], tpr[i], color=color, lw=1.5,
                 label='ROC Curve of class {0} (area = {1:0.3f})'.format(lbl, roc_auc[i]))

    plt.plot([0, 1], [0, 1], 'k--', lw=1.5)  # chance diagonal
    plt.xlim([-0.05, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve for CIFAR-10 Multi-Class Data')
    plt.legend(loc='lower right', prop={'size': 6})
    fullpath = save_plot_path.joinpath(save_plot_path.stem + '_roc_curve.png')
    plt.savefig(fullpath)
    plt.show()
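For context, I call the function roughly like this (the paths are placeholders, and the preprocessing just mirrors what I do at training time, so treat it as a sketch):

# Example invocation; 'saved_model/cifar10_cnn' and 'plots' are placeholder paths.
from pathlib import Path
from tensorflow.keras.datasets import cifar10

(_, _), (xtest, ytest) = cifar10.load_data()
ytest = ytest.ravel()                        # CIFAR-10 labels come as a column vector
xtest = xtest.astype('float32') / 255.0      # same scaling used during training (assumption)
assess_model_from_pb(Path('saved_model/cifar10_cnn'), xtest, ytest, Path('plots'))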
I suppose I am just confused about how my AUC can be near 1 when my precision and recall are not perfect. I understand that many thresholds are used to determine what counts as a positive class and what counts as a negative class. For example, towards the beginning of the curve, where the threshold is very high (say around 0.99999), how is it that my TPR is already near 1? Is it solely because at that threshold I am only giving positive classifications to the samples with the very highest softmax probabilities?
I would just like a bit more explanation or intuition on the topic, to make sure I am not doing something incorrectly.
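To make the question concrete, this is the kind of single-threshold check I have in mind (using the y_score and ytest_binary arrays computed inside the function above; as I understand it, roc_curve just repeats this for every threshold it derives from the scores):

# Pick one class and one very strict threshold, then compute TPR/FPR by hand.
# This gives a single point on that class's ROC curve.
import numpy as np

i = 0                                    # class index for 'airplane'
thr = 0.99999                            # a single, very high threshold
pred_pos = y_score[:, i] >= thr          # only the most confident predictions count as positive
true_pos = ytest_binary[:, i] == 1

tpr_at_thr = np.sum(pred_pos & true_pos) / np.sum(true_pos)     # recall at this threshold
fpr_at_thr = np.sum(pred_pos & ~true_pos) / np.sum(~true_pos)   # false-alarm rate at this threshold
print(tpr_at_thr, fpr_at_thr)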