Reputation: 663
I am currently developing a program in TensorFlow that reads in data that is 1750 by 1750 pixels. I ran it through a convolutional network:
import os
import sys
import tensorflow as tf
import Input
FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_integer('batch_size', 100, "hello")
tf.app.flags.DEFINE_string('data_dir', '/Volumes/Machine_Learning_Data', "hello")
def inputs():
    if not FLAGS.data_dir:
        raise ValueError('Please supply a data_dir')
    data_dir = os.path.join(FLAGS.data_dir, 'Data')
    images, labels = Input.inputs(data_dir = data_dir, batch_size = FLAGS.batch_size)
    return images, labels

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape = shape)
    return tf.Variable(initial)

def conv2d(images, W):
    return tf.nn.conv2d(images, W, strides = [1, 1, 1, 1], padding = 'SAME')

def max_pool_5x5(images):
    return tf.nn.max_pool(images, ksize = [1, 5, 5, 1], strides = [1, 1, 1, 1], padding = 'SAME')

def forward_propagation(images):
    with tf.variable_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        image_matrix = tf.reshape(images, [-1, 1750, 1750, 1])
        h_conv1 = tf.nn.sigmoid(conv2d(image_matrix, W_conv1) + b_conv1)
        h_pool1 = max_pool_5x5(h_conv1)

    with tf.variable_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.sigmoid(conv2d(h_pool1, W_conv2) + b_conv2)
        h_pool2 = max_pool_5x5(h_conv2)

    with tf.variable_scope('conv3') as scope:
        W_conv3 = weight_variable([5, 5, 64, 128])
        b_conv3 = bias_variable([128])
        h_conv3 = tf.nn.sigmoid(conv2d(h_pool2, W_conv3) + b_conv3)
        h_pool3 = max_pool_5x5(h_conv3)

    with tf.variable_scope('local3') as scope:
        W_fc1 = weight_variable([10 * 10 * 128, 256])
        b_fc1 = bias_variable([256])
        h_pool3_flat = tf.reshape(h_pool3, [-1, 10 * 10 * 128])
        h_fc1 = tf.nn.sigmoid(tf.matmul(h_pool3_flat, W_fc1) + b_fc1)

        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

        W_fc2 = weight_variable([256, 4])
        b_fc2 = bias_variable([4])

        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    return y_conv

def error(forward_propagation_results, labels):
    labels = tf.cast(labels, tf.float32)
    mean_squared_error = tf.square(tf.sub(labels, forward_propagation_results))
    cost = tf.reduce_mean(mean_squared_error)
    train = tf.train.GradientDescentOptimizer(learning_rate = 0.3).minimize(cost)
    return train
    print cost
Unfortunately, an error has popped up:
Incompatible shapes for broadcasting: TensorShape([Dimension(100)]) and TensorShape([Dimension(9187500), Dimension(4)])
and I have not been able to debug this.
What is the issue with the matrix dimensions? The interpreter says the error occurred at the tf.sub line.
Edit:
This is the main part of the code, where the functions are called:
import Input
import Process
import tensorflow as tf
def train():
    with tf.Session() as sess:
        images, labels = Process.inputs()
        forward_propgation_results = Process.forward_propagation(images)
        train_loss = Process.error(forward_propgation_results, labels)
        init = tf.initialize_all_variables()
        sess.run(init)

def main(argv = None):
    train()

if __name__ == '__main__':
    tf.app.run()
Upvotes: 0
Views: 9926
Reputation: 984
I've found the following problems:

1. Your labels input is a simple 1-dimensional array of label identifiers, but it needs to be one-hot encoded into a matrix of size [batch_size, 4] that is filled with 1s and 0s.

2. Your max pooling operation needs strides different from 1 to actually reduce the width and height of the image, so setting strides = [1, 5, 5, 1] should work.

3. After fixing that, your max pooling operations don't actually bring the width/height down from 1750 to 10 as you're assuming, but only to 14 (because 1750 / 5 / 5 / 5 == 14). So you probably want to increase your weight matrix here, but there are other options as well.

4. Is it possible that your images start out with 3 channels? You're assuming grayscale here, so you should either reshape image_matrix to have 3 channels, or convert the images to grayscale.

After applying these fixes, both the network output and the labels should have shape [batch_size, 4] and you should be able to calculate the difference. A sketch of these fixes follows below.
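For concreteness, here is a minimal sketch of what those fixes could look like, assuming the labels arrive as integer class IDs in the range 0-3 and the images really are single-channel (the variable name labels_one_hot is just illustrative, not from your code):

# One-hot encode the integer labels (shape [batch_size]) into [batch_size, 4]
# so they can be compared element-wise with the softmax output.
labels_one_hot = tf.one_hot(labels, depth=4)

# Pooling with stride 5 actually shrinks the feature maps: 1750 -> 350 -> 70 -> 14.
def max_pool_5x5(images):
    return tf.nn.max_pool(images, ksize=[1, 5, 5, 1],
                          strides=[1, 5, 5, 1], padding='SAME')

# After three stride-5 poolings the flattened size is 14 * 14 * 128,
# so the first fully connected layer has to match that.
W_fc1 = weight_variable([14 * 14 * 128, 256])
b_fc1 = bias_variable([256])
h_pool3_flat = tf.reshape(h_pool3, [-1, 14 * 14 * 128])

With the labels one-hot encoded and the pooling strides fixed, both arguments to tf.sub have shape [batch_size, 4], so the subtraction no longer fails to broadcast.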
Edit: I've adjusted this after discussing the code in the chat below.
Upvotes: 2
Reputation: 171
One-hot labeling adds a dimension to its input. As an example, if the labels tensor is of size [batch, 1], using tf.one_hot(batch_labels, depth=2, axis=-1) returns a tensor of size [batch, 1, 2]. For a labels tensor of size [batch_size, 1], the following gets rid of the extra dimension:

tf.one_hot(tf.squeeze(batch_labels, [1]), depth=2, axis=-1)

Basically, the labels tensor must be of size [batch_size,]. The tf.squeeze() function eliminates specific dimensions; the [1] argument tells it to eliminate the second dimension, which is 1. A small example is shown below.
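As a quick shape check (the tensor values and depth=2 here are a made-up example, not taken from the question):

import tensorflow as tf

# Labels with an extra trailing dimension: shape [4, 1].
batch_labels = tf.constant([[0], [1], [1], [0]])

# Squeeze away the second dimension, then one-hot encode:
# [batch_size, 1] -> [batch_size] -> [batch_size, 2]
one_hot = tf.one_hot(tf.squeeze(batch_labels, [1]), depth=2, axis=-1)

with tf.Session() as sess:
    print(sess.run(one_hot))  # [[1. 0.] [0. 1.] [0. 1.] [1. 0.]]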
Upvotes: 0