Keras - Single pixel classification without ImageDataGenerator

This is a newbie question, but I just can't get the simplest Keras experiment to work. I went through a course and its samples work well, so my computer is set up correctly.

I have a few thousand 16x12 images with names like "GGYRBGBBW.png", "BBYWBRBBB.png" and so on. Each image has a single color with minimal shade differences and is reduced to a single pixel when loading. The first character of each filename serves as the training label (e.g. green images' names start with 'G'). I need to build and train a simple model that identifies which of the 6 possible colors is in the image. (This is a first learning step towards a much more complicated project.)

I don't want to use ImageDataGenerator because this whole project will grow beyond what a simple directory-structure categorization can do, and I'll do the image randomization with my own external image generator.

I created a Keras model that looks like this:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
flatten_1 (Flatten)          (None, 3)                 0
_________________________________________________________________
dense_1 (Dense)              (None, 6)                 24
=================================================================
Total params: 24
Trainable params: 24
Non-trainable params: 0

So, pretty simple - the input is only three values (RGB), flattened, followed by 6 output neurons, one per color category.
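As a quick sanity check on the summary above, the 24 parameters are what you'd expect from a fully connected layer: one weight per input/unit pair plus one bias per unit. The helper name `dense_param_count` below is made up for illustration; it is not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 3 RGB inputs feeding 6 output neurons
print(dense_param_count(3, 6))  # 24
```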

When running the experiments, beginning with random initial values, accuracy stays super low, sometimes 0.

Epoch 1/50
210/210 [==============================] - 0s 2ms/step - loss: 7.2430 - acc: 0.2095 - val_loss: 6.7980 - val_acc: 0.2000
Epoch 2/50
210/210 [==============================] - 0s 10us/step - loss: 7.2411 - acc: 0.2095 - val_loss: 9.6617 - val_acc: 0.2000
Epoch 3/50
210/210 [==============================] - 0s 5us/step - loss: 9.9256 - acc: 0.2095 - val_loss: 9.6598 - val_acc: 0.2000
Epoch 4/50
210/210 [==============================] - 0s 5us/step - loss: 9.9236 - acc: 0.2095 - val_loss: 9.6579 - val_acc: 0.2000
Epoch 5/50
210/210 [==============================] - 0s 10us/step - loss: 9.9217 - acc: 0.2095 - val_loss: 9.6560 - val_acc: 0.2000
Epoch 6/50
210/210 [==============================] - 0s 10us/step - loss: 9.9197 - acc: 0.2095 - val_loss: 9.6541 - val_acc: 0.2000

I must be missing something trivial, but since I'm a noob in this, I can't figure out what. Here's the complete source code:

from __future__ import print_function

import random
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
os.system('cls')

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras import backend as K
from keras.preprocessing.image import img_to_array
import numpy as np

num_classes = 6
batch_size = 300
epochs = 50
Image_width, Image_height = 1, 1
train_dir = './OneColorTrainingData'
colors="WYBGRO"


def load(numOfPics):
    X = []
    y = []
    print("Reading training images")
    allFiles = os.listdir(train_dir)
    randomFiles = random.choices(allFiles, k=numOfPics)

    for f in randomFiles:
        path = os.path.join(train_dir, f)
        img = keras.preprocessing.image.load_img(path, grayscale=False, target_size=(Image_height, Image_width))
        img = img_to_array(img)
        img /= 255
        X.append(img)
        y.append(colors.index(f[0]))

    y = keras.utils.to_categorical(y, num_classes=num_classes)
    return X, y

Data, labels = load(batch_size)
print(str(len(Data)) + " training files loaded")
print(labels)

model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(num_classes, activation=K.tanh))

model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])

print(model.input_shape)
print(model.summary())

hist = model.fit(np.array(Data), np.array(labels), batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.3)

score = model.evaluate(np.array(Data), np.array(labels), verbose=0)
print('Test loss: ', score[0])
print('Test accuracy: ', score[1])

Any help would be appreciated.

Upvotes: 2

Views: 782

Answers (3)

OK, I figured it out. I needed to switch to softmax (thanks to all who suggested it) and use many more epochs, because convergence was random and slow. I didn't think I needed more epochs because nothing changed after the first few, but it turned out that once I used 500 instead of 50, the network managed to learn all the colors one by one and hit 100% accuracy (with a single output layer) almost every time. Thank you all for the help!
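For anyone wondering why the activation matters so much here, this is a rough NumPy sketch, not the Keras internals. The pre-activation values in `h` are invented, and `categorical_crossentropy` below is a simplified stand-in for the Keras loss (it only clips and takes the log). The point is that tanh outputs can be negative and don't sum to 1, so cross-entropy has nothing sensible to work with, while softmax always yields a valid probability distribution.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Simplified stand-in: clip into (0, 1), then take -sum(y * log(p))
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Hypothetical pre-activations of the 6 output neurons for one image
h = np.array([-1.2, 0.3, 2.0, -0.5, 0.1, -2.0])
# One-hot label for class 0, whose pre-activation happens to be negative
y_true = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

tanh_out = np.tanh(h)
soft_out = softmax(h)

print("tanh output:   ", tanh_out, "sum =", tanh_out.sum())
print("softmax output:", soft_out, "sum =", soft_out.sum())

# With tanh, the negative output for the true class is clipped to eps,
# so the loss saturates at -log(eps) regardless of how negative it is.
print("loss with tanh:   ", categorical_crossentropy(y_true, tanh_out))
print("loss with softmax:", categorical_crossentropy(y_true, soft_out))
```

With softmax the loss changes smoothly as the weights change, which is what the optimizer needs in order to make progress.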

Upvotes: 0

Engineero

Reputation: 12938

I'm not positive, but I think you need a hidden layer. You currently have a single affine transformation on your input: h_0 = W_0*x + b_0. This is passed through K.tanh, so your outputs are simply y = tanh(h_0). I have a hunch that for a small enough problem like this you may be able to prove that this cannot converge, but I am not certain.

I would just add a second dense layer, and use softmax for your final output as @nuric suggests:

model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(10, activation=K.tanh))
model.add(Dense(num_classes, activation='softmax'))
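To make it concrete what those two layers compute, here is a small NumPy sketch of the forward pass only. The weights are random placeholders (nothing is trained), and the variable names are mine, not Keras attributes. It just shows the shapes and that the softmax output is a proper distribution over the 6 classes.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Subtract the row max for numerical stability
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# 4 made-up single-pixel images, already flattened to RGB in [0, 1)
x = rng.random((4, 3))

# Placeholder (untrained) parameters for Dense(10) and Dense(6)
W1, b1 = rng.normal(scale=0.5, size=(3, 10)), np.zeros(10)
W2, b2 = rng.normal(scale=0.5, size=(10, 6)), np.zeros(6)

hidden = np.tanh(x @ W1 + b1)      # hidden layer with tanh activation
probs = softmax(hidden @ W2 + b2)  # output layer with softmax activation

print(probs.shape)        # one row of 6 class probabilities per image
print(probs.sum(axis=1))  # each row sums to 1
```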

Upvotes: 0

nuric

Reputation: 11225

You have K.tanh as the activation of your final Dense layer. For 1-of-many classification we usually use softmax instead, which produces a probability distribution over the colour classes:

model.add(Dense(num_classes, activation='softmax'))

Now your target labels will be one-hot vectors such as [0,0,1,0,0,0], indicating which class each image belongs to. You can also use the sparse_categorical_crossentropy loss and give the labels as class integers (e.g. 2) as your target. It means the same thing in this context.
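A small NumPy sketch of why the two label formats are equivalent. The probabilities in `probs` are invented for illustration, and the two loss expressions are hand-written versions of the formulas rather than calls into Keras. With one-hot targets the sum picks out the log-probability of the true class; with integer targets you index it directly.

```python
import numpy as np

# Made-up softmax outputs for two images over 6 colour classes
probs = np.array([
    [0.10, 0.10, 0.60, 0.10, 0.05, 0.05],
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
])

class_ids = np.array([2, 0])    # integer labels ("sparse" format)
one_hot = np.eye(6)[class_ids]  # the same labels as one-hot rows

# Categorical cross-entropy: -sum over classes of y * log(p)
loss_one_hot = -np.sum(one_hot * np.log(probs), axis=1)

# Sparse version: just pick log(p) of the true class by index
loss_sparse = -np.log(probs[np.arange(len(class_ids)), class_ids])

print(loss_one_hot)
print(loss_sparse)
```

Both print the same per-sample losses, so the choice is mostly about which label format is more convenient to produce.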

Upvotes: 1
