Yes92

Reputation: 311

Tensorflow convolution

I'm trying to perform a convolution (conv2d) on images of variable dimensions. The images are stored as 1-D arrays, and I'm having a lot of trouble getting the shapes right. This is my conv2d call:

tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

where x is the input image. The error is:

ValueError: Shape must be rank 4 but is rank 1 for 'Conv2D' (op: 'Conv2D') with input shapes: [1], [5,5,1,32].

I think I need to reshape x, but I don't know the right dimensions. When I try this code:

x = tf.reshape(self.x, shape=[-1, 5, 5, 1]) # example

I get this:

ValueError: Dimension size must be evenly divisible by 25 but is 1 for 'Reshape' (op: 'Reshape') with input shapes: [1], [4] and with input tensors computed as partial shapes: input[1] = [?,5,5,1].

Upvotes: 2

Views: 830

Answers (1)

Maxim

Reputation: 53758

You can't use conv2d with a tensor of rank 1. Here's the description from the doc:

Computes a 2-D convolution given 4-D input and filter tensors.

These four dimensions are [batch, height, width, channels] (as Engineero already wrote).

If you don't know the dimensions of the image in advance, TensorFlow lets you declare a dynamic shape:

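import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
data = np.zeros([1, 50, 178, 3])  # example input (an assumption): one 50x178 RGB image
x = tf.placeholder(tf.float32, shape=[None, None, None, 3], name='x')
with tf.Session() as session:
    print(session.run(x, feed_dict={x: data}))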

In this example, a 4-D tensor x is created, but only the number of channels (3) is known statically; everything else is determined at runtime. So you can pass this x into conv2d even though its size is dynamic.
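To make the shape rule concrete without depending on a particular TensorFlow version, here is a minimal NumPy sketch of a stride-1, zero-padded convolution. It is a stand-in written for illustration, not TensorFlow's implementation, and the name conv2d_same is hypothetical. It only shows that a fixed [5, 5, 1, 32] filter accepts any batch size, height and width, as long as the input is rank 4.

```python
import numpy as np

def conv2d_same(x, w):
    """Stride-1, zero-padded ('SAME') 2-D convolution in cross-correlation form.

    x: [batch, height, width, in_channels]
    w: [filter_height, filter_width, in_channels, out_channels]
    Illustrative stand-in only; not the TensorFlow kernel.
    """
    if x.ndim != 4 or w.ndim != 4:
        raise ValueError("both x and w must be rank 4")
    batch, height, width, in_channels = x.shape
    kh, kw, w_in, out_channels = w.shape
    if in_channels != w_in:
        raise ValueError("channel mismatch between x and w")

    # Pad so that the output keeps the input's height and width.
    pad_top, pad_left = (kh - 1) // 2, (kw - 1) // 2
    padded = np.pad(
        x,
        ((0, 0), (pad_top, kh - 1 - pad_top), (pad_left, kw - 1 - pad_left), (0, 0)),
        mode="constant",
    )

    out = np.zeros((batch, height, width, out_channels))
    for i in range(kh):
        for j in range(kw):
            # Shifted view of the input, shape [batch, height, width, in_channels].
            window = padded[:, i:i + height, j:j + width, :]
            # w[i, j] has shape [in_channels, out_channels].
            out += np.tensordot(window, w[i, j], axes=([3], [0]))
    return out

w = np.ones((5, 5, 1, 32))                        # same filter shape as in the question
small = conv2d_same(np.ones((1, 28, 28, 1)), w)   # one 28x28 grayscale image
large = conv2d_same(np.ones((4, 50, 178, 1)), w)  # four 50x178 grayscale images
print(small.shape, large.shape)                   # (1, 28, 28, 32) (4, 50, 178, 32)
```

Passing a rank-1 array to this function raises a ValueError, which mirrors the "Shape must be rank 4" error in the question.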

But there's another problem. You didn't describe your task, but if you're building a convolutional neural network, I'm afraid you'll need to know the input size in order to determine the size of the FC layer that follows the pooling operations; that size must be static. If that's the case, I think the best solution is to scale your inputs to a common size before passing them into the convolutional network.
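A small arithmetic sketch shows why the FC layer size depends on the input size. It assumes SAME-padded, stride-1 convolutions (which keep height and width) followed by 2x2, stride-2, SAME-padded pooling (which halves each side, rounding up). The helper name flattened_size, the two pooling layers and the 32 output channels are assumptions chosen for illustration.

```python
import math

def flattened_size(height, width, channels, num_pools):
    """Feature count fed to the FC layer after `num_pools` 2x2 stride-2 SAME poolings."""
    for _ in range(num_pools):
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
    return height * width * channels

# Two differently sized images produce different FC input sizes.
size_a = flattened_size(50, 178, channels=32, num_pools=2)
size_b = flattened_size(64, 64, channels=32, num_pools=2)
print(size_a, size_b)  # 18720 8192
```

Because the FC weight matrix has one row per input feature, it cannot serve both sizes at once, which is why resizing every image to a common height and width first is the simpler route.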

UPD:

Since it wasn't clear, here's how you can reshape any image into a 4-D array.

import numpy as np
a = np.zeros([50, 178, 3])
shape = a.shape
print(shape)    # prints (50, 178, 3)
a = a.reshape([1] + list(shape))
print(a.shape)  # prints (1, 50, 178, 3)
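The question starts from a flat 1-D array rather than a 3-D image, so the same idea can be sketched for that case. This assumes the height, width and channel count of each image are known from elsewhere (the values below are made up), and the helper can_reshape is hypothetical. It also shows why the reshape in the question failed: an array with a single element cannot be split into 5x5x1 blocks.

```python
import numpy as np

height, width, channels = 50, 178, 1  # assumed to be known for this image
# Stand-in for one flattened image.
flat = np.arange(height * width * channels, dtype=np.float32)

# Add the batch dimension and restore height, width and channels.
image = flat.reshape([1, height, width, channels])
print(image.shape)  # (1, 50, 178, 1)

def can_reshape(flat_array, height, width, channels):
    """True if the flat array holds a whole number of height x width x channels images."""
    return flat_array.size % (height * width * channels) == 0

print(can_reshape(flat, height, width, channels))  # True
print(can_reshape(np.zeros(1), 5, 5, 1))           # False, as in the question's error
```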

Upvotes: 3
