Data Science Asked by yalun gao on May 9, 2021
I used the classic MNIST dataset, which has 784 pixel columns and 1 label column (digits 0 to 9), and I wanted to map each image to the seven-segment display representation of its digit. The following is my code.
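# Load the packages used below: keras for the model, caret for confusionMatrix()
library(keras)
library(caret)
# train_set and test_set are assumed to be data frames that are already loaded,
# with the label in column V1 followed by the 784 pixel columns
# Convert the labels (digits) in train_set and test_set to seven-segment display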
train_set$a <- ifelse(train_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
train_set$b <- ifelse(train_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
train_set$c <- ifelse(train_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
train_set$d <- ifelse(train_set$V1 %in% c(0,2,3,5,6,8),1,0)
train_set$e <- ifelse(train_set$V1 %in% c(0,2,6,8),1,0)
train_set$f <- ifelse(train_set$V1 %in% c(0,4,5,6,8,9),1,0)
train_set$g <- ifelse(train_set$V1 %in% c(2,3,4,5,6,8,9),1,0)
test_set$a <- ifelse(test_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
test_set$b <- ifelse(test_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
test_set$c <- ifelse(test_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
test_set$d <- ifelse(test_set$V1 %in% c(0,2,3,5,6,8),1,0)
test_set$e <- ifelse(test_set$V1 %in% c(0,2,6,8),1,0)
test_set$f <- ifelse(test_set$V1 %in% c(0,4,5,6,8,9),1,0)
test_set$g <- ifelse(test_set$V1 %in% c(2,3,4,5,6,8,9),1,0)
# Split the given train data to train_x and train_y
# Reshaping the training pixels and labels data to arrays
train_x <- as.matrix(train_set[, 2:785])
train_x <- array_reshape(train_x, c(nrow(train_x), 784))
train_y <- as.matrix(train_set[, 786:792])
train_y <- array_reshape(train_y, c(nrow(train_y), 7))
# Split the given test data to test_x and test_y
# Reshaping the testing pixels and labels data to arrays
test_x <- as.matrix(test_set[, 2:785])
test_x <- array_reshape(test_x, c(nrow(test_x), 784))
test_y <- as.matrix(test_set[, 786:792])
test_y <- array_reshape(test_y, c(nrow(test_y), 7))
# Normalize inputs from 0-255 to 0-1
train_x <- train_x / 255
test_x <- test_x / 255
# Build the Model
image_size <- 784 # 28*28
num_classes <- 7 #7 segment display of the digits
model <- keras_model_sequential()
model %>%
  # Hidden layers
  layer_dense(units = 512, activation = 'relu', input_shape = c(image_size)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_dense(units = 256, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  # Output layer
  layer_dense(units = num_classes, activation = 'sigmoid')
# Summary of the model
summary(model)
# Compile the neural network
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)
# Modeling on Training Dataset
model %>% fit(
  train_x, train_y,
  epochs = 5, batch_size = 128,
  validation_data = list(test_x, test_y)
)
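# Prediction
# predict() returns the sigmoid outputs directly; predict_proba() is deprecated in newer keras releases
pred <- predict(model, test_x)
# Round each probability to 0/1; the columns are named V1..V7 (segments a..g)
pred <- round(as.data.frame(pred))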
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
& pred$V5==1 & pred$V6==1 & pred$V7==0,0,NA)
test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0
& pred$V5==0 & pred$V6==0 & pred$V7==0,1,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==0 & pred$V4==1
& pred$V5==1 & pred$V6==0 & pred$V7==1,2,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
& pred$V5==0 & pred$V6==0 & pred$V7==1,3,test_set$predict)
test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0
& pred$V5==0 & pred$V6==1 & pred$V7==1,4,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1
& pred$V5==0 & pred$V6==1 & pred$V7==1,5,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1
& pred$V5==1 & pred$V6==1 & pred$V7==1,6,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0
& pred$V5==0 & pred$V6==0 & pred$V7==0,7,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1
& pred$V5==1 & pred$V6==1 & pred$V7==1,8,test_set$predict)
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0
& pred$V5==0 & pred$V6==1 & pred$V7==1,9,test_set$predict)
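# Fix the factor levels to 0..9 so both factors match even if a digit is never predicted;
# rows whose rounded output matches no digit stay NA
confusionMatrix(factor(test_set$predict, levels = 0:9), factor(test_set$V1, levels = 0:9))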
During training, the accuracy reported for my model was always only around 20% to 30%. However, when I used the model to predict and transformed the outputs back into digit labels, the accuracy was quite good, roughly 85% every time.
I don't know which part of my model is wrong. Can someone help me out? Much appreciated!
The accuracy of my model:
The accuracy of my prediction:
The dataset can be downloaded here:
https://www.kaggle.com/zalando-research/fashionmnist
Here is the seven-segment display chart:
[image: seven-segment display chart]
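For reference, the mapping in the chart can also be written as a single lookup table instead of repeated ifelse() calls. The following is only a minimal sketch in base R: `segment_table`, `encode_digits` and `decode_segments` are hypothetical names that do not appear in the code above, and the table copies the encoding used in the question (where segment d is off for the digit 9).

```r
# Rows are digits 0..9, columns are segments a..g
# (same encoding as the ifelse() calls in the question)
segment_table <- matrix(c(
  1, 1, 1, 1, 1, 1, 0,  # 0
  0, 1, 1, 0, 0, 0, 0,  # 1
  1, 1, 0, 1, 1, 0, 1,  # 2
  1, 1, 1, 1, 0, 0, 1,  # 3
  0, 1, 1, 0, 0, 1, 1,  # 4
  1, 0, 1, 1, 0, 1, 1,  # 5
  1, 0, 1, 1, 1, 1, 1,  # 6
  1, 1, 1, 0, 0, 0, 0,  # 7
  1, 1, 1, 1, 1, 1, 1,  # 8
  1, 1, 1, 0, 0, 1, 1   # 9
), ncol = 7, byrow = TRUE,
dimnames = list(as.character(0:9), letters[1:7]))

# Encode a vector of digits into a matrix of 0/1 segment columns
encode_digits <- function(digits) {
  segment_table[as.character(digits), , drop = FALSE]
}

# Decode a 0/1 segment matrix back into digits;
# rows that match no digit pattern become NA
decode_segments <- function(seg) {
  keys <- apply(segment_table, 1, paste, collapse = "")
  rows <- apply(seg, 1, paste, collapse = "")
  match(rows, keys) - 1L
}
```

With this, something like `decode_segments(as.matrix(round(pred)))` would give the decoded digit for each test row, assuming `pred` holds the seven sigmoid outputs per row.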
It looks like you are mixing two different accuracy concepts, hence the difference in values:
Your network is currently set up to predict a value between 0 and 1 for each label (activation = 'sigmoid'), so you might get an output like [0.9 0.4 0.3 ... 0.2]. If you use this setup with the metric 'accuracy', Keras will infer that you want to calculate binary accuracy. This is not the same as categorical accuracy.
When you run the prediction part, you are using categorical accuracy.
To fix your problem change:
In this case, Keras should recognize that you want to measure categorical accuracy.
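To see how far these measures can drift apart on seven-segment targets, here is a small sketch in base R that does not need Keras. The values in `y_prob` are made up for illustration and are not outputs of the model above. The three measures are computed by hand: per-segment (binary) accuracy, argmax-based (categorical) accuracy, and the exact-match accuracy that decoding all seven segments back to a digit effectively measures. Taking the first index on ties in the argmax is an assumption of this sketch.

```r
# Seven-segment targets (a..g) for the digits 0, 1 and 8
y_true <- rbind(
  c(1, 1, 1, 1, 1, 1, 0),  # 0
  c(0, 1, 1, 0, 0, 0, 0),  # 1
  c(1, 1, 1, 1, 1, 1, 1)   # 8
)

# Hypothetical sigmoid outputs for the same three images
y_prob <- rbind(
  c(0.90, 0.80, 0.95, 0.70, 0.90, 0.60, 0.10),  # every segment on the right side of 0.5
  c(0.10, 0.90, 0.80, 0.20, 0.10, 0.30, 0.20),  # every segment on the right side of 0.5
  c(0.80, 0.80, 0.70, 0.90, 0.40, 0.80, 0.85)   # segment e is wrong
)

y_hat <- round(y_prob)

# Binary accuracy: fraction of individual segments predicted correctly
binary_accuracy <- mean(y_hat == y_true)

# Categorical accuracy: does the position of the largest output
# equal the position of the largest target value?
categorical_accuracy <- mean(
  max.col(y_prob, ties.method = "first") == max.col(y_true, ties.method = "first")
)

# Exact-match accuracy: all seven segments correct, i.e. the decoded digit is correct
exact_match_accuracy <- mean(rowSums(y_hat == y_true) == ncol(y_true))

print(c(binary = binary_accuracy,
        categorical = categorical_accuracy,
        exact_match = exact_match_accuracy))
```

In this toy case the binary accuracy is 20/21, the exact-match accuracy is 2/3 and the categorical accuracy is 1/3, so an argmax-based metric can look poor on multi-label targets even when most segments are right. As far as I know, which of these 'accuracy' resolves to depends on the Keras version and on the output shape, so naming the metric explicitly, for example metrics = c('binary_accuracy') in compile(), avoids relying on that inference.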
Answered by Burger on May 9, 2021