Data Science Asked by jakes on November 18, 2020
I’m trying to build a CNN for a two-label classification problem. Unfortunately, I can’t share my model architecture, but I compiled the model using:
from tensorflow.keras import optimizers

model.compile(optimizer=optimizers.Adam(lr=1e-3),
              loss='binary_crossentropy',
              metrics=['accuracy'])
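For context, here is a sketch of the same compile call with the metric spelled out explicitly rather than via the string 'accuracy' (assuming a TensorFlow 2.x tf.keras backend and a two-unit sigmoid output); as far as I understand, Keras resolves the string based on the loss and the output shape:

import tensorflow as tf
from tensorflow.keras import optimizers

# Same compile call, but naming the metric explicitly:
# per-label binary accuracy at a 0.5 threshold.
model.compile(optimizer=optimizers.Adam(lr=1e-3),
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.5)])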
Once training was finished, I plotted the log-loss and accuracy scores on the training vs. validation datasets. The log-loss looks like below:
so to my layman’s eyes this looks perfectly fine. On the other hand, the accuracy scores (see below; the labels in the plot are misleading) are erratic, and Keras reports an accuracy of 62.8% on the training set and 62.5% on the validation set at the end of the last epoch.
This seems disturbingly low to me, and the strong fluctuations on the validation set are concerning as well. I use a batch size of 32, which may partly explain these fluctuations, but I don’t think it accounts for all of the variance (although I may be entirely wrong…). However, when I calculated accuracy on my validation set using sklearn’s accuracy_score function:
from sklearn.metrics import accuracy_score

Y_prob = model.predict(X_test)         # predicted probabilities, shape (n_samples, 2)
Y_pred = (Y_prob > 0.5).astype(float)  # threshold at 0.5 to get hard 0/1 labels
accuracy_score(Y_test, Y_pred)
I got a value of 0.99811817183879.
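As a sanity check of my own (just numpy, with Y_test and Y_prob as above), here are the two aggregations I can think of: element-wise accuracy over every (sample, label) cell, and exact-match accuracy over whole rows, which is what sklearn’s accuracy_score computes for a 2-D multilabel target:

import numpy as np

Y_pred = (Y_prob > 0.5).astype(float)

# Element-wise: each (sample, label) cell counts on its own.
elementwise_acc = np.mean(Y_pred == Y_test)

# Exact match: a sample counts only if both labels are correct at once;
# sklearn's accuracy_score does this for 2-D (multilabel) input.
subset_acc = np.mean(np.all(Y_pred == Y_test, axis=1))

print(elementwise_acc, subset_acc)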
The contingency tables for each of the two labels look like below:
import pandas as pd
pd.crosstab(Y_test[:, 0], Y_pred[:, 0], normalize=True)
col_0       0.0       1.0
row_0
0.0    0.969804  0.000680
1.0    0.000680  0.028837
pd.crosstab(Y_test[:, 1], Y_pred[:, 1], normalize=True)
col_0       0.0       1.0
row_0
0.0    0.981739  0.000296
1.0    0.000227  0.017738
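Summing the diagonals of those normalized tables gives the per-label accuracies, which are in the same range as the sklearn score:

# Per-label accuracy from the crosstab diagonals (proportion on the diagonal):
acc_label_0 = 0.969804 + 0.028837   # ≈ 0.998641
acc_label_1 = 0.981739 + 0.017738   # ≈ 0.999477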
I’m totally confused about this discrepancy. Therefore, I’d like to ask: where does this gap between the ~62% accuracy reported by Keras and the ~0.998 value computed with sklearn’s accuracy_score come from?