Reputation: 83
I am currently trying to build a race recognition program using a convolutional neural network. I'm inputting 200px by 200px versions of the UTKFace dataset (I put my dataset on a Google Drive if you want to take a look). I'm using 8 different classes (4 races * 2 genders) with Keras and TensorFlow, each class having about 700 images, though I have also tried it with 1000. The problem is that when I run the network it gets at best 13.5% accuracy and about 11-12.5% validation accuracy, with a loss around 2.079-2.081, and even after 50 epochs or so it won't improve at all. My current hypothesis is that it is guessing randomly rather than learning, because 1/8 = 12.5%, which is about what it is getting, and other models I have made with 3 classes were getting about 33%.
I noticed the validation accuracy is different on the first and sometimes second epoch, but after that it stays constant. I've increased the pixel resolution and changed the number of layers, the types of layers, and the neurons per layer. I've tried different optimizers (SGD at the default learning rate and at very large and very small ones, 0.1 and 10^-6), and I've tried different loss functions such as KLDivergence, but nothing seems to have any effect, except KLDivergence, which did pretty well on one run (about 16%) but then flopped again. Some ideas I had are that maybe there's too much noise in the dataset, or maybe it has to do with the number of dense layers, but honestly I don't know why it is not learning.
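As a quick check on that hypothesis (a minimal sketch, not part of the original post; the variable names are only illustrative): a classifier that spreads its probability evenly over 8 balanced classes would score 1/8 accuracy and a cross-entropy of ln(8), which can be compared with the accuracy and loss figures quoted above.

```python
import math

num_classes = 8  # 4 races * 2 genders, as described in the question

# Accuracy of uniform guessing when the classes are balanced
chance_accuracy = 1 / num_classes

# Cross-entropy when every class is assigned probability 1/num_classes
chance_loss = -math.log(1 / num_classes)

print(f"chance accuracy: {chance_accuracy:.3f}")  # 0.125
print(f"chance loss:     {chance_loss:.4f}")      # 2.0794
```

A loss that sits at about 2.079 across epochs is therefore consistent with the network outputting a near-uniform distribution for every image.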
Here's the code to make the tensors:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import os
import cv2
import random
import pickle

WIDTH_SIZE = 200
HEIGHT_SIZE = 200
DATADIR = "./TRAINING"

# One sub-folder per class; the position in this list is the class label
CATEGORIES = []
for CATEGORY in os.listdir(DATADIR):
    CATEGORIES.append(CATEGORY)

training_data = []

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        # Use at most 700 images per class
        for img in os.listdir(path)[:700]:
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_COLOR)
                # cv2.resize takes (width, height)
                new_array = cv2.resize(img_array, (WIDTH_SIZE, HEIGHT_SIZE))
                training_data.append([new_array, class_num])
            except Exception as error:
                # Unreadable files make cv2.resize raise; report and skip them
                print(error)

create_training_data()
random.shuffle(training_data)

X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)

# OpenCV images are stored as (height, width, channels)
X = np.array(X).reshape(-1, HEIGHT_SIZE, WIDTH_SIZE, 3)
y = np.array(y)

# Context managers make sure both files are flushed and closed
with open("X.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)
with open("y.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)
Here's the model I built:
import tensorflow as tf  # needed for tf.keras.losses in model.compile below
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle
pickle_in = open("X.pickle","rb")
X = pickle.load(pickle_in)
pickle_in = open("y.pickle","rb")
y = pickle.load(pickle_in)
X = X/255.0
model = Sequential()
model.add(Conv2D(256, (2,2), activation = 'relu', input_shape = X.shape[1:]))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Dropout(0.4))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Dropout(0.4))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Conv2D(256, (2,2), activation = 'relu'))
model.add(Dropout(0.4))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(8, activation="softmax"))
model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['accuracy'])
model.fit(X, y, batch_size=16,epochs=100,validation_split=.1)
Here's a log of 10 epochs I ran:
Epoch 1/100
5040/5040 [==============================] - 55s 11ms/sample - loss: 2.0803 - accuracy: 0.1226 - val_loss: 2.0796 - val_accuracy: 0.1250
Epoch 2/100
5040/5040 [==============================] - 53s 10ms/sample - loss: 2.0797 - accuracy: 0.1147 - val_loss: 2.0798 - val_accuracy: 0.1161
Epoch 3/100
5040/5040 [==============================] - 53s 10ms/sample - loss: 2.0797 - accuracy: 0.1190 - val_loss: 2.0800 - val_accuracy: 0.1161
Epoch 4/100
5040/5040 [==============================] - 53s 11ms/sample - loss: 2.0797 - accuracy: 0.1173 - val_loss: 2.0799 - val_accuracy: 0.1107
Epoch 5/100
5040/5040 [==============================] - 52s 10ms/sample - loss: 2.0797 - accuracy: 0.1183 - val_loss: 2.0802 - val_accuracy: 0.1107
Epoch 6/100
5040/5040 [==============================] - 52s 10ms/sample - loss: 2.0797 - accuracy: 0.1226 - val_loss: 2.0801 - val_accuracy: 0.1107
Epoch 7/100
5040/5040 [==============================] - 52s 10ms/sample - loss: 2.0797 - accuracy: 0.1238 - val_loss: 2.0803 - val_accuracy: 0.1107
Epoch 8/100
5040/5040 [==============================] - 54s 11ms/sample - loss: 2.0797 - accuracy: 0.1169 - val_loss: 2.0802 - val_accuracy: 0.1107
Epoch 9/100
5040/5040 [==============================] - 52s 10ms/sample - loss: 2.0797 - accuracy: 0.1212 - val_loss: 2.0803 - val_accuracy: 0.1107
Epoch 10/100
5040/5040 [==============================] - 53s 11ms/sample - loss: 2.0797 - accuracy: 0.1177 - val_loss: 2.0802 - val_accuracy: 0.1107
So yeah, any help on why my network seems to be just guessing? Thank you!
Upvotes: 3
Views: 1176
Reputation: 11397
The problem lies in the design of your network.
... valid padding).
Here's a very basic network that gets me over 95% training accuracy after 10 epochs and about 40 seconds:
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
pickle_in = open("X.pickle","rb")
X = pickle.load(pickle_in)
pickle_in = open("y.pickle","rb")
y = pickle.load(pickle_in)
X = X/255.0
model = Sequential()
model.add(Conv2D(16, (5,5), activation = 'relu', input_shape = X.shape[1:], padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(32, (3,3), activation = 'relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (3,3), activation = 'relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(512))
model.add(Dense(8, activation='softmax'))
model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['accuracy'])
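The valid padding mentioned above shrinks every feature map a little at each convolution. As a rough illustration (plain Python, no TensorFlow; the helper `trace_sizes` and the tuple encoding of layers are made up for this sketch), the spatial size can be traced through the question's layer stack and through the `padding='same'` stack used here. Dropout layers are left out because they do not change the size.

```python
def trace_sizes(size, layers):
    """Return the spatial size after each layer.

    layers is a list of (kind, kernel, padding) tuples, where kind is
    "conv" (stride 1) or "pool" (stride equal to the pool size).
    """
    sizes = [size]
    for kind, kernel, padding in layers:
        if kind == "conv":
            # 'same' keeps the size at stride 1; 'valid' loses kernel - 1 pixels
            size = size if padding == "same" else size - kernel + 1
        else:
            # non-overlapping pooling with 'valid' padding floors the division
            size = size // kernel
        sizes.append(size)
    return sizes


conv_v = ("conv", 2, "valid")
pool = ("pool", 2, "valid")

# Layer order copied from the question's model
question_stack = [conv_v, pool,
                  conv_v, conv_v, conv_v, pool,
                  conv_v, conv_v, pool,
                  conv_v, conv_v, pool,
                  conv_v, conv_v, pool]

# Layer order of the network in this answer
answer_stack = [("conv", 5, "same"), pool,
                ("conv", 3, "same"), pool,
                ("conv", 3, "same"), pool]

question_sizes = trace_sizes(200, question_stack)
answer_sizes = trace_sizes(200, answer_stack)
print(question_sizes)  # ends at 4
print(answer_sizes)    # ends at 25
```

Under these assumptions the question's stack ends with 4x4 feature maps, while this one ends with 25x25.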
Architecture:
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_19 (Conv2D)           (None, 200, 200, 16)      1216
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 100, 100, 16)      0
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 100, 100, 32)      4640
_________________________________________________________________
max_pooling2d_15 (MaxPooling (None, 50, 50, 32)        0
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 50, 50, 64)        18496
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 25, 25, 64)        0
_________________________________________________________________
flatten_4 (Flatten)          (None, 40000)             0
_________________________________________________________________
dense_7 (Dense)              (None, 512)               20480512
_________________________________________________________________
dense_8 (Dense)              (None, 8)                 4104
=================================================================
Total params: 20,508,968
Trainable params: 20,508,968
Non-trainable params: 0
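The parameter counts in this summary can be reproduced by hand. The sketch below assumes 3 input channels (the images were loaded with cv2.IMREAD_COLOR) and one bias per filter or unit; the helper names are invented for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # one weight per kernel cell per input channel, plus one bias, per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    # one weight per input plus one bias, per unit
    return (inputs + 1) * units


counts = [
    conv2d_params(5, 5, 3, 16),       # conv2d_19
    conv2d_params(3, 3, 16, 32),      # conv2d_20
    conv2d_params(3, 3, 32, 64),      # conv2d_21
    dense_params(25 * 25 * 64, 512),  # dense_7, fed by the 40000-wide Flatten
    dense_params(512, 8),             # dense_8
]
total = sum(counts)
print(counts)  # [1216, 4640, 18496, 20480512, 4104]
print(total)   # 20508968
```

Almost all of the roughly 20.5 million parameters sit in the first Dense layer, because it is connected to every one of the 40000 flattened activations.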
Training:
Train on 5040 samples, validate on 560 samples
Epoch 1/10
5040/5040 [==============================] - 7s 1ms/sample - loss: 2.2725 - accuracy: 0.1897 - val_loss: 1.8939 - val_accuracy: 0.2946
Epoch 2/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 1.7831 - accuracy: 0.3375 - val_loss: 1.8658 - val_accuracy: 0.3179
Epoch 3/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 1.4857 - accuracy: 0.4623 - val_loss: 1.9507 - val_accuracy: 0.3357
Epoch 4/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 1.1294 - accuracy: 0.6028 - val_loss: 2.1745 - val_accuracy: 0.3250
Epoch 5/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.8060 - accuracy: 0.7179 - val_loss: 3.1622 - val_accuracy: 0.3000
Epoch 6/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.5574 - accuracy: 0.8169 - val_loss: 3.7494 - val_accuracy: 0.2839
Epoch 7/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.3756 - accuracy: 0.8813 - val_loss: 4.9125 - val_accuracy: 0.2643
Epoch 8/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.3001 - accuracy: 0.9036 - val_loss: 5.6300 - val_accuracy: 0.2821
Epoch 9/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.2345 - accuracy: 0.9337 - val_loss: 5.7263 - val_accuracy: 0.2679
Epoch 10/10
5040/5040 [==============================] - 6s 1ms/sample - loss: 0.1549 - accuracy: 0.9581 - val_loss: 7.3682 - val_accuracy: 0.2732
As you can see, the validation score is terrible, but the point was to demonstrate that a poor architecture can prevent training altogether.
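The gap between training and validation can be read straight off the numbers. A small sketch (values typed in from the 10-epoch log above; nothing here touches the model, and the variable names are illustrative) that finds the epoch with the lowest validation loss and the final train/validation accuracy gap:

```python
# Values copied from the 10-epoch training log above
val_loss = [1.8939, 1.8658, 1.9507, 2.1745, 3.1622,
            3.7494, 4.9125, 5.6300, 5.7263, 7.3682]
train_acc = [0.1897, 0.3375, 0.4623, 0.6028, 0.7179,
             0.8169, 0.8813, 0.9036, 0.9337, 0.9581]
val_acc = [0.2946, 0.3179, 0.3357, 0.3250, 0.3000,
           0.2839, 0.2643, 0.2821, 0.2679, 0.2732]

# Epochs are 1-based in the Keras log, list indices are 0-based
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Difference between training and validation accuracy after the last epoch
final_gap = train_acc[-1] - val_acc[-1]

print(best_epoch)           # 2
print(round(final_gap, 4))  # 0.6849
```

Validation loss bottoms out at epoch 2 and rises from there while training accuracy keeps climbing, which is the usual signature of overfitting.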
Upvotes: 4