Reputation: 2411
I am quite new to both Python and machine learning, and I am working on my first real image recognition project. It is based on this tutorial, which has only two classes (cat or dog) and a LOT more data. My multi-class script is nowhere near predicting correctly and, more importantly, I don't know how to troubleshoot it.
Below is the script. The data consists of 7 folders with about 10-15 images each. The images are 100x100 px photos of different domino tiles, and one folder contains just baby photos (mainly as a control group, because they are very different from the domino photos):
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.models import model_from_json
import numpy
import os
# Initialising the CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (25, 25), input_shape = (100, 100, 3), activation = 'relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
classifier.add(Conv2D(32, (25, 25), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 7, activation = 'sigmoid')) # 7 units equals amount of output categories
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (100, 100),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (100, 100),
                                            batch_size = 32,
                                            class_mode = 'categorical')
classifier.fit_generator(training_set,
                         steps_per_epoch = 168,
                         epochs = 35,
                         validation_data = test_set,
                         validation_steps = 3)
classifier.summary()
# serialize weights to HDF5
classifier.save_weights("dominoweights.h5")
print("Saved model to disk")
# Part 3 - Making new predictions
import numpy as np
from keras.preprocessing import image
path = 'dataset/prediction_images/' # Folder with my images
for filename in os.listdir(path):
    if "jpg" in filename:
        test_image = image.load_img(path + filename, target_size = (100, 100))
        test_image = image.img_to_array(test_image)
        test_image = np.expand_dims(test_image, axis = 0)
        result = classifier.predict(test_image)
        print result
        training_set.class_indices
        folder = training_set.class_indices.keys()[(result[0].argmax())] # Get the index of the highest predicted value
        if folder == '1':
            prediction = '1x3'
        elif folder == '2':
            prediction = '1x8'
        elif folder == '3':
            prediction = 'Baby'
        elif folder == '4':
            prediction = '5x7'
        elif folder == '5':
            prediction = 'Upside down'
        elif folder == '6':
            prediction = '2x3'
        elif folder == '7':
            prediction = '0x0'
        else:
            prediction = 'Unknown'
        print "Prediction: " + filename + " seems to be " + prediction
    else:
        print "DSSTORE"
    print "\n"
Explanations:
dataset/prediction_images/ contains about 10 different images that the script will predict.
result typically outputs array([[0., 0., 1., 0., 0., 0., 0.]], dtype=float32)
My question(s)
My main question is: do you see anything particularly wrong with the script? Or should the script work fine, and it is just the lack of data that makes the predictions wrong?
Subquestions:
The entire section with:
classifier.fit_generator(training_set,
                         steps_per_epoch = 168,
                         epochs = 35,
                         validation_data = test_set,
                         validation_steps = 3)
puzzles me. As far as I understood, steps_per_epoch should be the number of training images I have. Is that correct? Are epochs the amount of iterations the CNN does?
I don't see why this code is needed:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
It seems to me that it creates copies/versions of the images, zooms in on them, flips them, etc. Why would this be needed?
Any tips on this would help me immensely!
Upvotes: 0
Views: 1078
Reputation: 86600
The code doesn't seem to have anything clearly wrong, but filters of size (25,25) may not be a good choice.
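To make the concern about (25,25) filters concrete, here is a rough sketch (plain Python, no Keras needed) that traces the feature-map size and weight count through the question's conv -> pool -> conv -> pool stack. It assumes the Keras defaults the script relies on: Conv2D with padding='valid' and stride 1, and MaxPooling2D with stride equal to the pool size. The helper names and the 3x3 comparison are illustrative only, not part of the original script.

```python
def output_side(side, window, stride=1):
    """Side length after a 'valid' (no padding) convolution or pooling window."""
    return (side - window) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Trainable weights of a Conv2D layer: kernel*kernel*in*out, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def trace(kernel, side=100, channels=3, filters=32):
    """Return [(side after pooling, conv weights)] for the two conv/pool blocks."""
    rows = []
    for _ in range(2):
        params = conv_params(kernel, channels, filters)
        side = output_side(side, kernel)        # Conv2D, padding='valid'
        side = output_side(side, 2, stride=2)   # MaxPooling2D(pool_size=(2, 2))
        channels = filters                      # next layer sees 32 channels
        rows.append((side, params))
    return rows


print(trace(25))  # the question's filter size
print(trace(3))   # a smaller filter, for comparison only
```

With 25x25 filters the feature map shrinks from 100 to 7 pixels per side after two blocks, and the second convolution alone carries about 640,000 weights, which is a lot to fit with roughly a hundred training images.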
There are two possibilities:
Subquestions:
1 - Yes, you're using filters that are (25,25) windows sliding along the input images. The bigger your filters, the less general they can be.
2 - The number 32 refers to how many output "channels" you want for this layer. While your input images have 3 channels (red, green and blue), each of these convolution layers will produce 32 different channels. The meaning of each channel is up to the hidden mathematics we can't see.
3 - It's normal to have "a lot" of convolutional layers stacked one on top of another. Some well-known models have more than 10 convolutional layers.
4 - Generators produce batches with shape (batch_size, image_side1, image_side2, channels). steps_per_epoch is necessary because the generators used are infinite (so Keras doesn't know when to stop). A usual choice is steps_per_epoch = total_images//batch_size, so one epoch uses roughly all images once. But you can play with these numbers as you wish; the value of steps_per_epoch is up to the user.
5 - The image data generator, besides loading data from your folders and making the classes for you, is also a tool for data augmentation.
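For point 4, a minimal sketch of the steps_per_epoch arithmetic. The figure of 84 training images is an assumption (7 folders of about 12 images each; the question does not give an exact count), and the helper names are hypothetical. The second helper assumes the directory generator yields a smaller final batch at the end of each pass over the data.

```python
import math


def steps_per_epoch_for(total_images, batch_size):
    """Floor division as suggested above; at least 1 so tiny datasets still train."""
    return max(1, total_images // batch_size)


def batches_per_pass(total_images, batch_size):
    """Batches needed to see every image once, counting a partial last batch."""
    return math.ceil(total_images / batch_size)


total_images = 84   # assumption: 7 classes x ~12 images each
batch_size = 32     # same value as in the question's script

print(steps_per_epoch_for(total_images, batch_size))
print(batches_per_pass(total_images, batch_size))
# With steps_per_epoch = 168, one "epoch" loops over the same images many times:
print(168 / batches_per_pass(total_images, batch_size))
```

Under these assumptions, 168 steps per epoch means each epoch replays the small dataset dozens of times, so the 35 epochs in the script amount to far more passes over the data than the number suggests.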
Upvotes: 2