Reputation: 15
Here is my full code for context: https://hastebin.com/qigimomika.py
I have a problem with the lines below.
Some context:
import tensorflow

def weight_variable(shape):
    initial = tensorflow.truncated_normal(shape, stddev=0.01)
    return tensorflow.Variable(initial)

def bias_variable(shape):
    initial = tensorflow.constant(0.01, shape=shape)
    return tensorflow.Variable(initial)

w_layer1 = weight_variable([4, 32])
b_layer1 = bias_variable([32])
input_layer = tensorflow.placeholder("float", [4])
The line that produces the error:
h_layer1 = tensorflow.add(tensorflow.matmul(input_layer, w_layer1),b_layer1)
When I run the whole code (linked above), it produces the following ValueError:
ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul')
with input shapes: [4], [4,32].
My question: what is happening here, and how can I avoid it?
Thanks for your attention
EDIT: thanks to Prune and Ali Abbasi.
My solution: I changed the input_layer to:
input_layer = tensorflow.placeholder("float", [1, 4])
The problem was that my first array had rank 1 in TensorFlow ([4]) while my second array had rank 2 ([4, 32]). So I also added this line:
state = [state]
where state is the input fed to the network:
output_layer.eval(feed_dict={input_layer : state})
state was initially [1, 2, 3, 4] (rank 1), now it is [[1, 2, 3, 4]] (rank 2).
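To make the rank difference concrete, here is a small pure-Python sketch. The helper nested_shape is hypothetical (it is not a TensorFlow function) and assumes a regular, non-ragged nested list; it only mimics how a shape can be read off the value that gets fed in.

```python
def nested_shape(value):
    """Return the shape of a regular nested list as a tuple.

    Hypothetical helper for illustration only; assumes every sub-list at
    the same depth has the same length (no ragged lists).
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


state = [1, 2, 3, 4]
print(nested_shape(state), len(nested_shape(state)))  # shape (4,), rank 1

state = [state]  # wrap once to add a leading dimension of size 1
print(nested_shape(state), len(nested_shape(state)))  # shape (1, 4), rank 2
```

The rank is just the length of the shape tuple, which is why wrapping the list once turns a rank-1 value into a rank-2 value that matches a [1, 4] placeholder.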
thanks
EDIT2: OK, I have changed a lot since the last EDIT, and I lost track of the changes, so I can't record them all. In case you want to see my code here it is. I know it's messy as fuck. For now I am just so happy that shit is working :"D. You will probably not be able to understand my code, it's a total mess, but I wanted to document the current state. A big thanks to Ali Abbasi :D.
Upvotes: 0
Views: 299
Reputation: 974
As you know, the MatMul operation is classic matrix multiplication. If we want to multiply two matrices, M1 and M2, with shapes AxB and BxC respectively, the inner dimension B must be the same in both, and M1 x M2 results in a matrix R with shape AxC.
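The shape rule above can be sketched in plain Python. The function matmul_result_shape is a hypothetical helper, not part of TensorFlow; it only mirrors the two checks involved here (both operands must be rank 2, and the inner dimensions must match), under the assumption that shapes are given as tuples of ints.

```python
def matmul_result_shape(shape_a, shape_b):
    """Return the result shape of a strict rank-2 matrix product.

    Hypothetical helper for illustration; it is not a TensorFlow function.
    """
    if len(shape_a) != 2 or len(shape_b) != 2:
        raise ValueError(
            "Shape must be rank 2 but got ranks %d and %d"
            % (len(shape_a), len(shape_b))
        )
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:
        raise ValueError("Inner dimensions differ: %d vs %d" % (cols_a, rows_b))
    return (rows_a, cols_b)


print(matmul_result_shape((1, 4), (4, 32)))  # (1, 32)

try:
    # Rank-1 first operand, as in the question.
    matmul_result_shape((4,), (4, 32))
except ValueError as err:
    print("rejected:", err)
```

With shapes (1, 4) and (4, 32) the inner dimensions are both 4, so the product is allowed and has shape (1, 32); a rank-1 shape such as (4,) is rejected before the dimensions are even compared.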
In your case you are trying to multiply a tensor of shape [4] (rank 1) by a matrix of shape 4x32. MatMul needs both operands to be rank 2, so it throws the shape error. Transposing a rank-1 tensor leaves its shape unchanged, so reshape the first tensor to 1x4 instead; then 1x4 MatMul 4x32 results in a 1x32 matrix.
Your code here:
h_layer1 = tensorflow.add(tensorflow.matmul(input_layer, w_layer1),b_layer1)
Use it like this:
h_layer1 = tensorflow.add(tensorflow.matmul(tensorflow.reshape(input_layer, [1, 4]), w_layer1), b_layer1)
To investigate further, you can print the shapes of the tensors at each stage, like:
print(h_layer1.get_shape())
Once you see the shapes, you can adjust your shapes and inputs accordingly.
Good Luck.
Upvotes: 1