LalaLand

Reputation: 139

Deep learning model not predicting accurately in Keras?

I am new to deep learning and Keras. I have created a model that trains on the ASL (American Sign Language) dataset, with nearly 80,000 training images and 1,500 testing images. I have also appended some more classes, i.e. hand-sign numbers from 0-9. So, in total, I have 39 classes (0-9 and A-Z). My task is to train on this dataset and use the model for prediction. My input for prediction would be a frame from a webcam where I'll be displaying the hand sign.

My Keras Model

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

classifier = Sequential()

classifier.add(Conv2D(32, (3, 3), input_shape = (100, 100, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())

classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 39, activation = 'softmax'))

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])



from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('train',
                                                 target_size = (100,100),
                                                 batch_size = 128,
                                                 class_mode = 'categorical')

test_set = test_datagen.flow_from_directory('test',
                                            target_size = (100, 100),
                                            batch_size = 128,
                                            class_mode = 'categorical')

classifier.fit_generator(training_set,
                         steps_per_epoch = 88534,
                         epochs = 10,
                         validation_data = test_set,
                         validation_steps = 1418)

The ASL dataset images are 200x200 and the number-sign images are 64x64. After running for 5 epochs with a validation accuracy of 96%, I am still not able to get good predictions when I run the model on a video.

Python program for frames of video

import traceback

import cv2
import numpy as np
from keras.models import load_model

classifier = load_model('asl_original.h5')
classifier.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])

cam = cv2.VideoCapture(0)

while(1):
    try:
        ret, frame = cam.read()
        frame = cv2.flip(frame,1)
        roi = frame[100:400,200:500]
        cv2.rectangle(frame,(200,100),(500,400),(0,255,0),2) 
        cv2.imshow('frame',frame) 
        cv2.imshow('roi',roi)
        img = cv2.resize(roi,(100,100))
        img = np.reshape(img,[1,100,100,3]) 
        classes = classifier.predict_classes(img)
        print(classes)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break


    except Exception:
        traceback.print_exc()
        pass

I don't understand why I am not able to get accurate predictions even after training on such a large dataset. What changes do I need to make so that I get accurate predictions for all my 39 classes?

Links to the datasets: ASL dataset and hand signs for numbers.

Upvotes: 0

Views: 321

Answers (1)

Simone Coslovich

Reputation: 111

In classifier.compile you use loss='binary_crossentropy', which is meant for binary labels (two classes, or independent yes/no outputs). For multiclass classification you must use a loss function that matches the number and type of your labels: 'categorical_crossentropy' for one-hot labels, which is what class_mode = 'categorical' produces, or 'sparse_categorical_crossentropy' for integer class labels.
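
To make the label-type distinction concrete, here is a minimal pure-Python sketch of the two multiclass losses. It is not the Keras implementation, and the helper functions are written only for illustration. It shows that the one-hot and integer-label variants compute the same quantity, so the choice depends only on how the labels are encoded.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot label vector and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def sparse_categorical_crossentropy(class_index, probs):
    """Same loss, but the label is given as an integer class index."""
    return -math.log(probs[class_index])

# Illustrative softmax output for a 3-class problem (made-up numbers).
probs = [0.1, 0.7, 0.2]

one_hot = [0, 1, 0]   # label format produced by class_mode = 'categorical'
class_index = 1       # label format produced by class_mode = 'sparse'

loss_onehot = categorical_crossentropy(one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(class_index, probs)

# Both values equal -log(0.7), because only the true class contributes.
print(loss_onehot, loss_sparse)
```

In the question's setup this would presumably mean passing loss = 'categorical_crossentropy' to classifier.compile, since both generators already use class_mode = 'categorical'.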

Try reading this useful blog post, which explains every loss function in detail.
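
One possible reason the reported 96% validation accuracy does not carry over to the webcam frames: depending on the Keras version, pairing metrics = ['accuracy'] with 'binary_crossentropy' can resolve to an element-wise binary accuracy, which scores each of the 39 outputs separately. The sketch below is plain Python with hypothetical helper names and made-up probabilities, not the Keras code. It shows how such a metric can stay high even when the predicted class is wrong.

```python
def binary_accuracy(one_hot, probs, threshold=0.5):
    """Fraction of output units whose thresholded value matches the label."""
    hits = sum((p > threshold) == bool(t) for t, p in zip(one_hot, probs))
    return hits / len(one_hot)

def categorical_accuracy(one_hot, probs):
    """1.0 if the most probable class is the true class, else 0.0."""
    return float(probs.index(max(probs)) == one_hot.index(1))

num_classes = 39                 # as in the question
true_class = 0
predicted_class = 1              # a wrong prediction, chosen for illustration

one_hot = [1 if i == true_class else 0 for i in range(num_classes)]
# The model puts 0.9 on the wrong class and spreads 0.1 over the other 38.
probs = [0.9 if i == predicted_class else 0.1 / (num_classes - 1)
         for i in range(num_classes)]

bin_acc = binary_accuracy(one_hot, probs)
cat_acc = categorical_accuracy(one_hot, probs)

# Only 2 of the 39 outputs disagree with the label, so the element-wise
# score is 37/39 even though the predicted class is wrong.
print(bin_acc, cat_acc)
```

If this applies to the Keras version in use, the accuracy reported during training would be worth re-checking after switching to a categorical loss.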

Upvotes: 2
