TransWikia.com

How to improve accuracy in the following code?

Data Science Asked by Shreyas Mishra on June 29, 2021

I have image data for about 43 different categories of traffic signs. With a small subset of 3 categories, the maximum accuracy I get is around 65%, and I have tried many different layer sizes as well as losses and optimizers. With the complete dataset I only get about 5% accuracy. Please give me some pointers on how to improve the accuracy. Should I add more layers? If anybody knows a good example, please provide a link.

import cv2
import numpy as np
import os
import sys
import tensorflow as tf
import glob

from sklearn.model_selection import train_test_split

EPOCHS = 10
IMG_WIDTH = 30
IMG_HEIGHT = 30
NUM_CATEGORIES = 10
TEST_SIZE = 0.4     
path = []
data = []
label = []

def main():

    # Check command-line arguments
    if len(sys.argv) not in [2, 3]:
        sys.exit("Usage: python traffic.py data_directory [model.h5]")

    # Get image arrays and labels for all image files
    images, labels = load_data(sys.argv[1])

    # Split data into training and testing sets
    labels = tf.keras.utils.to_categorical(labels)
    x_train, x_test, y_train, y_test = train_test_split(
        np.array(images), np.array(labels), test_size=0.2
    )
    # Get a compiled neural network
    model = get_model()

    # Fit model on training data
    model.fit(x_train, y_train, epochs=EPOCHS)

    # Evaluate neural network performance
    model.evaluate(x_test,  y_test, verbose=2)

    # Save model to file
    if len(sys.argv) == 3:
        filename = sys.argv[2]
        model.save(filename)
        print(f"Model saved to {filename}.")


def load_data(data_dir):
    """
    Load image data from directory `data_dir`.

    Assume `data_dir` has one directory named after each category, numbered
    0 through NUM_CATEGORIES - 1. Inside each category directory will be some
    number of image files.

    Return tuple `(images, labels)`. `images` should be a list of all
    of the images in the data directory, where each image is formatted as a
    numpy ndarray with dimensions IMG_WIDTH x IMG_HEIGHT x 3. `labels` should
    be a list of integer labels, representing the categories for each of the
    corresponding `images`.
    """
    subdirs = [x[0] for x in os.walk(r'C:\Users\LENOVO\traffic\gtsrb')]
    for i in range(1,NUM_CATEGORIES+1):
        subdirs[i] = os.path.join(subdirs[i], "*.*")
    for i in range(1,NUM_CATEGORIES+1):
        for file in glob.glob(subdirs[i]):
            a = cv2.imread(file)
            a = cv2.resize(a,(IMG_WIDTH,IMG_HEIGHT))
            data.append(a)
            label.append(i-1)

    return(data,label)

def get_model():
    """
    Returns a compiled convolutional neural network model. Assume that the
    `input_shape` of the first layer is `(IMG_WIDTH, IMG_HEIGHT, 3)`.
    The output layer should have `NUM_CATEGORIES` units, one for each category.
    """
    model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(
        32, (3, 3), activation="relu", input_shape=(30,30,3)
    ),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),

    # Add an output layer with one unit per category
    tf.keras.layers.Dense(NUM_CATEGORIES, activation="softmax")])
    model.compile(
    optimizer="Nadam",
    loss="categorical_crossentropy",
    metrics=["accuracy"])
    return model


if __name__ == "__main__":
    main()

2 Answers

Adding more images to the training set is one way to increase accuracy. You can also use transfer learning, i.e. reuse layers trained on a larger dataset. This works well if you are classifying common things such as animals, for which pre-trained networks such as ResNet50 already exist.

When you have enough images that the accuracy no longer increases, you can try adding more layers, alternating convolutional and ReLU layers. If you train for many epochs and the model starts to overfit, i.e. training-set accuracy still trends up but test-set accuracy trends down, you can add layers that reduce overfitting, such as a dropout layer. You should first get the model to overfit and then try to reduce the overfitting.

If you don't have enough training data you could also do augmentation, e.g. adding rotated, shifted and horizontally flipped images to the training set.
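As a rough illustration of the augmentation idea, here is a minimal NumPy-only sketch that adds randomly shifted copies of each image. The function names, the `max_shift` and `copies` parameters, and the edge padding are assumptions made for this example and are not taken from the question's code. Rotation is left out because it needs an image library. Horizontal flipping is off by default, since a mirrored sign (e.g. "keep right") may belong to a different class.

```python
import numpy as np


def augment(image, rng, max_shift=3, flip=False):
    """Return a randomly shifted (and optionally mirrored) copy of an H x W x C image."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Pad with edge pixels so the shifted image has no wrapped-around content.
    pad = ((max_shift, max_shift), (max_shift, max_shift), (0, 0))
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape[:2]
    top, left = max_shift - dy, max_shift - dx
    shifted = padded[top:top + h, left:left + w]
    if flip and rng.random() < 0.5:
        shifted = shifted[:, ::-1]  # mirror left/right
    return shifted


def augment_dataset(images, labels, copies=2, seed=0):
    """Keep the original images and append `copies` augmented versions of each."""
    rng = np.random.default_rng(seed)
    out_images = list(images)
    out_labels = list(labels)
    for image, label in zip(images, labels):
        for _ in range(copies):
            out_images.append(augment(image, rng))
            out_labels.append(label)
    return np.array(out_images), np.array(out_labels)
```

Augmentation should only be applied to the training split, so that the test set still measures performance on unmodified images.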

The thing with traffic signs is that if you are classifying images that also contain the landscape instead of just the cropped traffic signs, you are going to have problems. In that case you should also detect the traffic signs (YOLO v3 could be used via transfer learning) instead of just classifying them.

Here's also a good example of traffic sign classification using transfer learning on Kaggle: https://www.kaggle.com/valentynsichkar/traffic-signs-classification-with-cnn

Answered by kkz on June 29, 2021

In order to improve the accuracy, you first have to understand the current issues. Without having looked into the dataset, I assume the following problems occur:

1) Your neural network is too simple in its structure.

2) Number of images is rather small

3) Number of images per class is not balanced

Once these issues are well-understood, you have a chance of improving the accuracy.

Let's discuss these issues in detail.

Issue 1) Your neural network is too simple in its structure:

The network you defined is a feedforward CNN. It applies 32 different convolutions to the input image, each followed by the ReLU function. As a result you obtain 32 so-called feature maps. The interpretation is that each feature map responds to some image feature that is needed for the classification task. For example, one such convolution could correspond to a Prewitt filter, which outputs the horizontal or vertical edges present in the image. Another feature map could correspond to detecting round corners, arcs and so on. The final fully connected layer then classifies the image based on these generated features.
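To make the "convolution followed by ReLU gives a feature map" step concrete, here is a small NumPy sketch using a fixed Prewitt kernel on a toy image. In a real CNN the kernel values are learned rather than fixed. The helper names and the 6 x 6 test image are made up for this illustration.

```python
import numpy as np

# Prewitt kernel that responds to dark-to-bright transitions from left to right,
# i.e. to vertical edges.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]], dtype=float)


def conv2d_valid(image, kernel):
    """Slide the kernel over a 2-D image without padding (what a Conv2D layer computes)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out


def relu(x):
    return np.maximum(x, 0)


# Toy image: left half dark, right half bright, so there is one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

feature_map = relu(conv2d_valid(image, PREWITT_X))
print(feature_map[0])  # each row is [0. 3. 3. 0.]: the map is active only around the edge
```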

In a deep CNN, it has been observed that the feature maps in the early layers correspond to low-level features (corners, lines, ...), while the feature maps in the later layers respond to high-level features (compositions of low-level features, e.g. squares, circles, faces, ...), because each convolution is applied to the "image" generated by the previous layer (which is not the input image, but a feature map). Accordingly, it is essential to have multiple layers in order to achieve this hierarchical construction of features, which has turned out to be very effective (and is similar to how the human brain works in some areas).

Therefore, one direction to improve accuracy is to add more layers of convolutions (and poolings). While this has turned out to be sufficient to achieve very high accuracies on many tasks, it is probably not sufficient in your case, due to problem 2).
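A deeper variant of the network from the question could look roughly like the following sketch. The number of blocks, the filter counts (32, 64, 128), the `padding="same"` setting and the `adam` optimizer are assumptions chosen for illustration, not tuned values. The sketch also assumes that the input images have been scaled to the range [0, 1] (e.g. `images / 255.0`), which the question's code does not do.

```python
import tensorflow as tf


def build_deeper_cnn(num_categories=43, input_shape=(30, 30, 3)):
    """Three convolution/pooling blocks followed by a small dense classifier."""
    model = tf.keras.models.Sequential([
        tf.keras.Input(shape=input_shape),
        # Block 1: low-level features (edges, corners)
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        # Block 2: compositions of low-level features
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        # Block 3: higher-level features
        tf.keras.layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        # One output unit per category
        tf.keras.layers.Dense(num_categories, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_deeper_cnn().summary()` shows how the spatial size shrinks from 30 x 30 to 3 x 3 while the number of feature maps grows.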

Issue 2) Number of images is rather small:

Theoretically, if you use a deep CNN, there should be some set of weights that gives very good accuracy. In practice this will probably not work in your case, mainly because:

  • none of the known optimization methods used to obtain (train) the weights of a CNN is guaranteed to deliver a globally optimal solution.

  • even if we had the globally optimal solution, and even if we had 100% accuracy on the training set, this would not imply that the CNN performs well on unseen images.

It is very likely that you would obtain weights that lead to bad test accuracy, even if you obtain the correct classifications on the training set. The main reason is that the learned weights very likely do not correspond to plausible "explanations" for the general task (classification of traffic signs) but are "shortcuts" instead.

Example: Let's assume all images of the stop sign have been taken using camera A and all images of a "keep right" sign have been taken using camera B. Now if the sensor of camera A has some defect, e.g. the center pixel is always black, optimizing the weights will likely lead to stop signs being classified by looking at the center pixel. However, once the test set contains an image of a stop sign taken with another camera, this will lead to a misclassification.

In general, if you are using a deep CNN, there are many variables that need to be optimized from the training data (among others, the kernel weights of the convolutions). In order to avoid classifications based on implausible reasoning, it is crucial to have many training images, so that no "shortcuts" are possible. In the example, it would be crucial to take images with different cameras, so that hopefully the "explanations" or features learned by the CNN are not special characteristics of the training data but reflect a plausible model (a stop sign is an octagon with a red background and text).

If you do not have enough training data, one option is to use transfer learning. This means that we use a pre-trained CNN model (trained on another task, but with a huge dataset), so that the features in the early layers already respond to low-level features. You then use training only to adapt the high-level features to your specific classification task. This is less prone to shortcuts, as there are already meaningful features in the early layers, which are fed forward through the network.
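A minimal sketch of this setup in Keras might look as follows. The choice of MobileNetV2 as the pre-trained model, the 96 x 96 input size (the images would have to be resized to that instead of 30 x 30), the dropout rate and the function name are all assumptions made for illustration. The input is assumed to hold raw pixel values in [0, 255].

```python
import tensorflow as tf


def build_transfer_model(num_categories=43, input_shape=(96, 96, 3), weights="imagenet"):
    """Frozen pre-trained feature extractor with a new, trainable classification head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    # MobileNetV2 expects inputs in [-1, 1]; this maps [0, 255] to that range.
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    # Only this layer is trained on the traffic-sign data.
    outputs = tf.keras.layers.Dense(num_categories, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With `weights="imagenet"` Keras downloads the pre-trained weights on first use; passing `weights=None` builds the same architecture with random weights, which is only useful for checking shapes.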

Another option is to use another ML model, e.g. Random Forest, which does not need so many training samples as the features are not learned. In this case, features have to be defined or learned in another way.
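As a sketch of what "features defined in another way" could mean, the following example feeds simple per-channel color histograms into a scikit-learn random forest. The color histogram is only a stand-in chosen to keep the example short; shape-based descriptors may well suit traffic signs better. The function names and the parameters `bins` and `n_estimators` are assumptions for this illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def color_histogram(image, bins=8):
    """Hand-defined feature: one normalized intensity histogram per color channel."""
    feats = []
    for c in range(image.shape[2]):
        channel = image[:, :, c]
        hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
        feats.append(hist / channel.size)
    return np.concatenate(feats)


def extract_features(images):
    return np.array([color_histogram(img) for img in images])


def train_forest(images, labels, n_estimators=200, seed=0):
    """Fit a random forest on the hand-defined features (integer labels, not one-hot)."""
    forest = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    forest.fit(extract_features(images), labels)
    return forest
```

Predictions for new images are then obtained with `forest.predict(extract_features(new_images))`.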

Issue 3) Number of images per class is not balanced:

There are several ways to handle class imbalance for random forests and neural networks. Most of them affect how you sample your data.
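One sampling-based approach is random oversampling: images of the rarer classes are drawn repeatedly until every class has as many samples as the largest one. The following NumPy sketch assumes integer labels (before one-hot encoding); the function name and the `seed` parameter are made up for this example.

```python
import numpy as np


def oversample(images, labels, seed=0):
    """Repeat samples of minority classes so that all classes have equal counts."""
    rng = np.random.default_rng(seed)
    images = np.asarray(images)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()

    indices = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw the missing samples from this class with replacement.
        extra = rng.choice(idx, size=target - count, replace=True)
        indices.extend(idx)
        indices.extend(extra)

    indices = rng.permutation(indices)  # shuffle so classes are mixed
    return images[indices], labels[indices]
```

This should be applied to the training split only, after the train/test split, so that duplicated images do not end up in both sets.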

Answered by Graph4Me Consultant on June 29, 2021
