Reputation: 1690
To learn more about deep learning and computer vision, I'm working on a project to perform lane-detection on roads. I'm using TFLearn as a wrapper around Tensorflow.
Background
The training inputs are images of roads (each image represented as a 50x50 pixel 2D array, with each element being a luminance value from 0.0 to 1.0).
The training outputs are the same shape (50x50 array), but represent the marked lane area. Essentially, non-road pixels are 0, and road pixels are 1.
This is not a fixed-size image classification problem, but instead a problem of detecting road vs. non-road pixels from a picture.
Problem
I've not been able to successfully shape my inputs/outputs in a way that TFLearn/Tensorflow accepts, and I'm not sure why. Here is my sample code:
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
The error I receive is on the model.fit call:
ValueError: Cannot feed value of shape (1, 50, 50) for Tensor u'InputData/X:0', which has shape '(?, 50, 50, 1)'
I've tried reducing the sample input/output arrays to a 1D vector (with length 2500), but that leads to other errors.
I'm a bit lost with how to shape all this, any help would be greatly appreciated!
Upvotes: 4
Views: 794
Reputation: 11
The error states that you have conflicting tensor ranks: the input placeholder expects a rank-4 tensor of shape (?, 50, 50, 1), but the data you feed it is rank 3, with shape (1, 50, 50). This is because your input data (X) lacks the trailing channel dimension. All that is needed here is to reshape X to [-1, 50, 50, 1] before feeding it into your network.
# X = An array of training inputs (of shape (50 x 50)).
# Y = An array of training outputs (of shape (50 x 50)).
# "None" equals the number of samples in my training set, 50 represents
# the size of the 2D image array, and 1 represents the single channel
# (grayscale) of the image.
X = X.reshape([-1, 50, 50, 1])
network = input_data(shape=[None, 50, 50, 1])
network = conv_2d(network, 50, 50, activation='relu')
# Does the 50 argument represent the output shape? Should this be 2500?
network = fully_connected(network, 50, activation='softmax')
network = regression(network, optimizer='adam', loss='categorical_crossentropy', learning_rate=0.001)
model = tflearn.DNN(network, tensorboard_verbose=1)
model.fit(X, Y, n_epoch=10, shuffle=True, validation_set=(X, Y), show_metric=True, batch_size=1)
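For illustration, here is a minimal standalone sketch (plain NumPy, with made-up data in place of your real training set) showing how the reshape adds the channel dimension that input_data expects:

```python
import numpy as np

# Hypothetical training set: 10 grayscale road images, each 50x50 pixels.
X = np.random.rand(10, 50, 50)

# Add the trailing single-channel dimension so the shape matches
# input_data(shape=[None, 50, 50, 1]); -1 infers the sample count.
X = X.reshape([-1, 50, 50, 1])

print(X.shape)  # (10, 50, 50, 1)
```

The same reshape applies regardless of how many samples you have, since -1 lets NumPy infer the first dimension.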
Upvotes: 1
Reputation: 4250
Have a look at the imageflow wrapper for TensorFlow, which converts a numpy array containing multiple images into a .tfrecords file, the suggested format for feeding data into TensorFlow: https://github.com/HamedMP/ImageFlow
You have to install it using
$ pip install imageflow
Suppose your numpy array containing some k images is k_images, and the corresponding k labels (one-hot encoded) are stored in k_labels. Then creating a .tfrecords file named 'tfr_file.tfrecords' is as simple as writing the line
imageflow.convert_images(k_images, k_labels, 'tfr_file')
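If your labels are plain integer class indices, they can be one-hot encoded with NumPy before the call above. A minimal sketch (the label values and class count here are made up for illustration):

```python
import numpy as np

# Hypothetical integer class labels for k = 3 images.
labels = np.array([0, 2, 1])
num_classes = 3

# Index into an identity matrix to one-hot encode each label.
k_labels = np.eye(num_classes)[labels]

print(k_labels)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```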
Alternatively, Google's Inception model contains code to read images from a folder hierarchy, assuming each folder represents one label: https://github.com/tensorflow/models/blob/master/inception/inception/data/build_image_data.py
Upvotes: 1