Apoorv Patne

Reputation: 889

How to train a CNN model on 2 classes of 100 samples each and then test it on 200 new samples?

I've got 2 classes in my training set: birds (100 samples) and no_birds (100 samples). The test set is unlabelled and consists of 200 samples (a mix of birds and no_birds). For every sample in the test set, I intend to classify it as bird or no_bird using a CNN with Keras.

import numpy as np
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Activation
from keras.layers.core import Dense, Flatten
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import *
from sklearn.metrics import confusion_matrix
import itertools
import matplotlib.pyplot as plt

train_path = 'dataset/train_set'
test_path = 'dataset/test_set'

train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(224,224), classes=['bird', 'no_bird'], batch_size=10) # bird directory consisting of 100 
test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=['unknown'], batch_size=10)

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(224,224,3)),
    Flatten(),
    Dense(2, activation='softmax'),
])

model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])

model.fit_generator(train_batches, steps_per_epoch=20, validation_data=test_batches, validation_steps=20, epochs=10, verbose=2)

The error I get at the last step is:

ValueError: Error when checking target: expected dense_1 to have shape (2,) but got array with shape (1,)

I suspect this is because test_set has only one directory, since it's unlabelled. Correct me if I'm wrong. What should I do to make this work?

Upvotes: 0

Views: 839

Answers (2)

Mohammad Athar

Reputation: 1980

The line test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=['unknown'], batch_size=10) is wrong.

Use test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=['bird', 'no_bird'], batch_size=10) instead. That way you can score your predictions.

Follow-up information:

The documentation at https://keras.io/models/sequential/ says:

validation_data: tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. This will override validation_split.

The labels of your test data must have the same shape as the labels of your training data. You'll have to organize the test data directory so it's structured the same way as the training directory, with one bird and one no_bird subdirectory.
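To illustrate where the (2,) versus (1,) mismatch comes from, here is a minimal sketch using only NumPy. It assumes the generator one-hot encodes labels with one column per entry in classes, which is how the default categorical class mode behaves; the one_hot helper and the sample indices are made up for illustration and are not part of Keras.

```python
import numpy as np

def one_hot(class_indices, num_classes):
    """Hypothetical helper: one-hot encode integer class indices."""
    labels = np.zeros((len(class_indices), num_classes), dtype="float32")
    labels[np.arange(len(class_indices)), class_indices] = 1.0
    return labels

# Training generator: classes=['bird', 'no_bird'] gives 2 columns per label.
train_labels = one_hot(np.array([0, 1, 1, 0]), num_classes=2)

# Test generator: classes=['unknown'] gives only 1 column per label.
test_labels = one_hot(np.array([0, 0, 0, 0]), num_classes=1)

print(train_labels.shape[1:])  # per-sample shape expected by Dense(2)
print(test_labels.shape[1:])   # per-sample shape the test generator supplies
```

The final Dense(2) layer expects targets with two columns, so a generator built with a single class cannot be used as validation_data.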

Upvotes: 0

HMK

Reputation: 608

It seems your test set is unlabelled. Remove the validation arguments from the model.fit_generator call. It should be:

model.fit_generator(train_batches, steps_per_epoch=20, epochs=10, verbose=2)

You can't validate without labels.
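After training without validation, the unlabelled images can still be classified from the model's softmax output. The following is a minimal sketch, assuming the probabilities come from something like model.predict_generator on a test generator created with class_mode=None and shuffle=False, so that no labels are expected and the file order is preserved. The probs array, the file names, and the probabilities_to_labels helper are made-up stand-ins, and the column order assumes classes=['bird', 'no_bird'] as in the question.

```python
import numpy as np

# Assumed column order, matching classes=['bird', 'no_bird'] in the question.
CLASS_NAMES = ['bird', 'no_bird']

def probabilities_to_labels(probs, class_names=CLASS_NAMES):
    """Hypothetical helper: pick the most probable class for each row."""
    probs = np.asarray(probs)
    return [class_names[i] for i in probs.argmax(axis=1)]

# Stand-in for the model output on three test images (one row per image).
probs = np.array([
    [0.90, 0.10],
    [0.20, 0.80],
    [0.55, 0.45],
])

# Stand-in file names; with shuffle=False the rows follow the file order.
filenames = ['unknown/img_001.jpg', 'unknown/img_002.jpg', 'unknown/img_003.jpg']

predicted = probabilities_to_labels(probs)
for name, label in zip(filenames, predicted):
    print(name, label)
```

Without labels there is still no accuracy score, only a predicted class per file.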

Upvotes: 1
