TransWikia.com

Using keras in R to perform neural network, my model has very low accuracy but the prediction is good and I don't know why

Data Science Asked by yalun gao on May 9, 2021

I used the classic MNIST dataset, which has 784 pixel columns and 1 label column (digits 0 to 9), and I wanted to transform the images into their corresponding seven-segment display representation. The following is my code.

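library(keras)  # keras_model_sequential(), layer_dense(), array_reshape(), ...
library(caret)  # confusionMatrix()

# Convert the labels (digits) in train_set and test_set to seven-segment display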
train_set$a <- ifelse(train_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
train_set$b <- ifelse(train_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
train_set$c <- ifelse(train_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
train_set$d <- ifelse(train_set$V1 %in% c(0,2,3,5,6,8),1,0)
train_set$e <- ifelse(train_set$V1 %in% c(0,2,6,8),1,0)
train_set$f <- ifelse(train_set$V1 %in% c(0,4,5,6,8,9),1,0)
train_set$g <- ifelse(train_set$V1 %in% c(2,3,4,5,6,8,9),1,0)

test_set$a <- ifelse(test_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
test_set$b <- ifelse(test_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
test_set$c <- ifelse(test_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
test_set$d <- ifelse(test_set$V1 %in% c(0,2,3,5,6,8),1,0)
test_set$e <- ifelse(test_set$V1 %in% c(0,2,6,8),1,0)
test_set$f <- ifelse(test_set$V1 %in% c(0,4,5,6,8,9),1,0)
test_set$g <- ifelse(test_set$V1 %in% c(2,3,4,5,6,8,9),1,0)

# Split the given train data to train_x and train_y
# Reshaping the training pixels and labels data to arrays
train_x <- as.matrix(train_set[, 2:785])
train_x <- array_reshape(train_x, c(nrow(train_x), 784))
train_y <- as.matrix(train_set[, 786:792])
train_y <- array_reshape(train_y, c(nrow(train_y), 7))

# Split the given test data to test_x and test_y
# Reshaping the testing pixels and labels data to arrays
test_x <- as.matrix(test_set[, 2:785])
test_x <- array_reshape(test_x, c(nrow(test_x), 784))
test_y <- as.matrix(test_set[, 786:792])
test_y <- array_reshape(test_y, c(nrow(test_y), 7))

# Normalize inputs from 0-255 to 0-1
train_x <- train_x / 255
test_x <- test_x / 255

# Build the Model
image_size <- 784 # 28*28
num_classes <- 7 #7 segment display of the digits
model <- keras_model_sequential() 
model %>%
  #Hidden Layers
  layer_dense(units = 512, activation = 'relu', input_shape = c(image_size)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_dense(units = 256, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  # Output Layer
  layer_dense(units = num_classes, activation = 'sigmoid')

# Summary of the model
summary(model)

# Compile the neural network
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)

# Modeling on Training Dataset
model %>% fit(
  train_x, train_y,
  epochs = 5, batch_size = 128,
  validation_data = list(test_x, test_y)
)
# Prediction
# predict() returns the sigmoid outputs; predict_proba() is deprecated in recent keras releases
pred <- predict(model, test_x)
pred <- round(as.data.frame(pred))

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
                           & pred$V5==1 & pred$V6==1 & pred$V7==0,0,NA)

test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0
                           & pred$V5==0 & pred$V6==0 & pred$V7==0,1,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==0 & pred$V4==1
                           & pred$V5==1 & pred$V6==0 & pred$V7==1,2,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
                           & pred$V5==0 & pred$V6==0 & pred$V7==1,3,test_set$predict)

test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0
                           & pred$V5==0 & pred$V6==1 & pred$V7==1,4,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1
                           & pred$V5==0 & pred$V6==1 & pred$V7==1,5,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1
                           & pred$V5==1 & pred$V6==1 & pred$V7==1,6,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0
                           & pred$V5==0 & pred$V6==0 & pred$V7==0,7,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
                           & pred$V5==1 & pred$V6==1 & pred$V7==1,8,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0
                           & pred$V5==0 & pred$V6==1 & pred$V7==1,9,test_set$predict)

confusionMatrix(factor(test_set$predict), factor(test_set$V1))
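
The encoding rules and the chain of ifelse() calls above describe the same digit-to-segment mapping twice. As a minimal sketch in base R, a single lookup table can serve both directions. The names segment_table, encode_segments and decode_segments are hypothetical and not part of the original code, and the pred values are made up for illustration.

```r
# Rows are digits 0-9, columns are segments a-g (1 = lit, 0 = dark).
# The patterns mirror the ifelse() rules above (9 is drawn without segment d).
segment_table <- matrix(
  c(1,1,1,1,1,1,0,   # 0
    0,1,1,0,0,0,0,   # 1
    1,1,0,1,1,0,1,   # 2
    1,1,1,1,0,0,1,   # 3
    0,1,1,0,0,1,1,   # 4
    1,0,1,1,0,1,1,   # 5
    1,0,1,1,1,1,1,   # 6
    1,1,1,0,0,0,0,   # 7
    1,1,1,1,1,1,1,   # 8
    1,1,1,0,0,1,1),  # 9
  nrow = 10, byrow = TRUE,
  dimnames = list(as.character(0:9), letters[1:7])
)

# Hypothetical helper: map a vector of digit labels to a 7-column 0/1 matrix.
encode_segments <- function(digits) {
  segment_table[as.character(digits), , drop = FALSE]
}

# Hypothetical helper: round each row of outputs and return the digit whose
# pattern matches exactly, or NA when the row is not a valid digit pattern.
decode_segments <- function(pred) {
  pred <- round(as.matrix(pred))
  keys <- apply(segment_table, 1, paste, collapse = "")
  pred_keys <- apply(pred, 1, paste, collapse = "")
  as.integer(names(keys)[match(pred_keys, keys)])
}

# Made-up sigmoid outputs for three images.
pred <- rbind(c(0.9, 0.8, 0.7, 0.1, 0.2, 0.1, 0.0),  # rounds to the pattern of 7
              c(0.1, 0.9, 0.9, 0.2, 0.1, 0.8, 0.9),  # rounds to the pattern of 4
              c(0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1))  # not a valid digit pattern
decode_segments(pred)
```

Rows that do not round to a valid pattern decode to NA, just as they do with the ifelse() chain, so they count as errors when the decoded digits are compared with the true labels.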

During training, my model consistently reports only around 20% to 30% accuracy. However, when I use the model for prediction and transform the outputs back into digit labels, the accuracy is quite good, roughly 85% every time.
I don't know which part of my model is wrong. Can someone help me out? I'd really appreciate it!
The accuracy of my model: [screenshot not preserved]
The accuracy of my prediction: [screenshot not preserved]
The dataset can be downloaded here:
https://www.kaggle.com/zalando-research/fashionmnist
Here is the seven-segment display chart: [image not preserved]

One Answer

It looks like you are mixing two different accuracy concepts, hence the difference in values:

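  1. Your network is currently set up to predict a value between 0 and 1 for each of the seven segments (activation = 'sigmoid'), so an output might look like [0.9 0.4 0.3 ... 0.2]. With this set-up and the metric 'accuracy', Keras chooses the concrete accuracy metric for you. Depending on the Keras version, the loss and the output shape, that is either binary accuracy (computed per segment) or categorical accuracy (computed from the position of the largest output). Neither is the same as the per-digit accuracy you compute afterwards.

  2. When you run the prediction part, you decode the seven rounded outputs back into a digit and compare it with the true label, which is categorical accuracy over the ten digits.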
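
To make the distinction concrete, here is a minimal sketch in base R with made-up numbers (y_true and y_prob are illustrative, not taken from the question). It computes three accuracy variants on the same predictions. Which of them Keras reports for 'accuracy' depends on the installed version, so treat the mapping to the training log as an assumption and check the metric name that is printed.

```r
# Toy targets: seven-segment patterns for the digits 1, 7 and 8.
y_true <- rbind(c(0, 1, 1, 0, 0, 0, 0),
                c(1, 1, 1, 0, 0, 0, 0),
                c(1, 1, 1, 1, 1, 1, 1))

# Made-up sigmoid outputs; only segment d of the last row is on the wrong
# side of 0.5.
y_prob <- rbind(c(0.1, 0.7, 0.9, 0.2, 0.1, 0.1, 0.1),
                c(0.8, 0.9, 0.7, 0.1, 0.2, 0.1, 0.1),
                c(0.8, 0.9, 0.7, 0.4, 0.9, 0.9, 0.9))

# Threshold the probabilities at 0.5 to get 0/1 segment predictions.
y_hat <- (y_prob > 0.5) * 1

# Binary accuracy: share of individual segments predicted correctly.
binary_acc <- mean(y_hat == y_true)

# Exact-match accuracy: share of rows where all seven segments are correct.
# This is what decoding to a digit and comparing labels measures.
exact_acc <- mean(rowSums(y_hat == y_true) == 7)

# Categorical accuracy: compares only the position of the largest value in
# each row, which carries little meaning for multi-hot targets.
categorical_acc <- mean(max.col(y_prob, ties.method = "first") ==
                        max.col(y_true, ties.method = "first"))

c(binary = binary_acc, exact = exact_acc, categorical = categorical_acc)
```

In this toy case 20 of 21 segments are right, two of three digits decode correctly, and the argmax comparison scores zero, which shows how far the three numbers can drift apart for the same predictions.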

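To fix your problem, predict the digit directly instead of the segments:

  1. set the final layer activation function to 'softmax', with ten output units (one per digit) and one-hot encoded labels, because softmax outputs sum to 1 and cannot represent several active segments at once
  2. use the 'categorical_crossentropy' loss function

In this case, Keras should recognize that you want to measure categorical accuracy.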
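
If the seven-segment targets are to be kept, another option is to leave the sigmoid outputs in place and name the metric explicitly rather than relying on 'accuracy'. The following is a sketch under that assumption, with the network shortened for brevity; it needs the keras package and a working TensorFlow installation.

```r
library(keras)

# Shortened version of the question's network: seven sigmoid outputs,
# one per display segment.
model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>%
  layer_dense(units = 7, activation = 'sigmoid')

model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  # Name the metric explicitly instead of letting Keras resolve 'accuracy'.
  metrics = c('binary_accuracy')
)
```

Binary accuracy is counted per segment, so it will usually be higher than the per-digit accuracy obtained after decoding, where all seven segments have to be right at once.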

Answered by Burger on May 9, 2021
