Reputation: 75
I am training my CNN on 10K grayscale images resized to 50x50 px, with 6 classes encoded as one-hot vectors. When I train the model, the loss drops from an insanely high value down to about 190 (the lowest I could get), with a terrible accuracy of approximately 16%. When I make a prediction on 10 test images I receive a sequence of 10 numbers that are usually all the same, like [2 2 2 2 2 2 2 2 2 2] or [5 5 5 5 5 5 5 5 5 5] for example, but this is not always the case; sometimes it is just a bunch of random numbers.
This is how I am encoding:
import os

import numpy as np
import pandas as pd
import skimage.data

def load_data(TRAINING_DIR):
    images = []
    labels = []
    directories = [d for d in os.listdir(TRAINING_DIR)
                   if os.path.isdir(os.path.join(TRAINING_DIR, d))]
    # Need to sort these because
    # FloydHub jumbled up the order
    directories = sorted(directories, key=int)
    # Traverse through each directory and make a list
    # of file names if they end in the PNG format
    for d in directories:
        label_directory = os.path.join(TRAINING_DIR, d)
        file_names = [os.path.join(label_directory, f)
                      for f in os.listdir(label_directory)
                      if f.endswith(".png")]
        # Traverse through each file, add the image data
        # and label to the two lists
        for f in file_names:
            images.append(skimage.data.imread(f))
            labels.append(int(d))
    return images, labels

images, labels = load_data(TRAINING_DIR)
images = np.array(images, object)
labels = np.array(labels, object)
# Convert labels into a one-hot vector
labels = pd.get_dummies(labels)
print('imported...')
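For reference, this is roughly what pd.get_dummies does with the integer labels (a minimal standalone sketch, not my real data):

import pandas as pd

# One column per distinct label, one indicator row per sample;
# newer pandas versions print True/False instead of 1/0
print(pd.get_dummies([0, 1, 5]))
#    0  1  5
# 0  1  0  0
# 1  0  1  0
# 2  0  0  1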
Then when I train the model it appears to actually be training, because the loss is decreasing. But when I run inference on the model, the predicted labels are in no way the same format as the one-hot encoding. Here is my session:
import time

import tensorflow as tf

def train_network(x):
    pred = convolutional_network(x)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=pred))
    train_op = tf.train.AdamOptimizer(learning_rate=0.085).minimize(loss)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # Initialize all the variables
        saver = tf.train.Saver()
        time_full_start = time.clock()
        print("RUNNING SESSION...")
        for epoch in range(num_epochs):
            epoch_loss = 0
            time_epoch_start = time.clock()
            i = 0
            while i < len(images):
                start = i
                end = i + batch_size
                train_batch_x = images[start:end]
                train_batch_y = labels[start:end]
                print('Training...')
                op, loss_value = sess.run([train_op, loss], feed_dict={x: train_batch_x, y: train_batch_y})
                epoch_loss += loss_value
                i += batch_size
            print('Epoch : ', epoch+1, ' of ', num_epochs, ' - Loss for epoch: ', epoch_loss)
Here is an example of what I'm talking about. After the model is trained I feed in 10 images of the first class and receive mostly 5's back. Shouldn't the model return something like [1,0,0,0,0,0,0,0,0,0], because that's what it was trained on?
Here is the jupyter notebook if you want to take a closer look of the code https://www.floydhub.com/arse123/projects/cnn-1/8/code/train_edge.ipynb
Upvotes: 2
Views: 519
Reputation: 982
It's happening because of correct_pred = tf.argmax(pred, 1): argmax gives you the index of the class with the highest score, not a one-hot vector. If you instead run predicted = sess.run(pred, feed_dict={x: test_images[0:10]}), you get a score for each class for each image. For example, after a softmax you might get [.1, .05, .05, .6, .1, .1] for your 6 classes; you won't get [0,0,0,1,0,0]. argmax then gives you the index corresponding to the .6.
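A minimal standalone sketch of what argmax does to one row of softmax output (plain NumPy, assuming your 6 classes):

import numpy as np

# Hypothetical probabilities for one image over 6 classes
probs = np.array([0.1, 0.05, 0.05, 0.6, 0.1, 0.1])
print(np.argmax(probs))  # prints 3 -- the index of the 0.6, not a one-hot vector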
Also note that pred holds raw logits, because the softmax is applied inside the loss. To get actual probabilities, add:

pred_ = tf.nn.softmax(pred)
predicted = sess.run(pred_, feed_dict={x: test_images[0:10]})
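If you want both the per-class probabilities and the predicted class indices in one run, something like this should work (a sketch assuming your existing sess, x, pred, and test_images):

pred_ = tf.nn.softmax(pred)       # probabilities, shape (10, 6)
classes = tf.argmax(pred_, 1)     # one class index per image, shape (10,)
probs, class_ids = sess.run([pred_, classes],
                            feed_dict={x: test_images[0:10]})
print(probs[0])    # e.g. [.1, .05, .05, .6, .1, .1]
print(class_ids)   # e.g. [3 3 5 ...]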
Upvotes: 1