j. cool

Reputation: 15

Why am I getting this ValueError?

Here is my full code, for context: https://hastebin.com/qigimomika.py .

I have a problem with the following lines.

A bit of context:

import tensorflow

def weight_variable(shape):
    initial = tensorflow.truncated_normal(shape, stddev=0.01)
    return tensorflow.Variable(initial)

def bias_variable(shape):
    initial = tensorflow.constant(0.01, shape=shape)
    return tensorflow.Variable(initial)

w_layer1 = weight_variable([4, 32])
b_layer1 = bias_variable([32])

input_layer = tensorflow.placeholder("float", [4])

The line which produces the error:

h_layer1 = tensorflow.add(tensorflow.matmul(input_layer, w_layer1),b_layer1)

When I run the whole code (linked above), it produces the following ValueError:

ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') 
            with input shapes: [4], [4,32].

Now my question: what is happening, and how can I avoid it?

Thanks for your attention.

EDIT: Thanks to Prune and Ali Abbasi.

My solution: I changed the input_layer to:

input_layer = tensorflow.placeholder("float", [1, 4])

The problem was that my first tensor was rank 1 (shape [4]) while my second tensor was rank 2 (shape [4, 32]). To match the new placeholder shape, I also added this line:

state = [state]

where state is the input that gets fed in:

output_layer.eval(feed_dict={input_layer : state})

state was initially [1, 2, 3, 4] (rank 1), now it is [[1, 2, 3, 4]] (rank 2).
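For anyone who wants to check the rank of a plain Python list before feeding it in, here is a minimal sketch. The helper list_rank is hypothetical (it is not part of TensorFlow) and it assumes a non-empty, regularly nested list:

```python
def list_rank(value):
    """Count how deeply a non-empty, regularly nested list is nested."""
    rank = 0
    # Walk down the first element of each level until we hit a non-list.
    while isinstance(value, list):
        rank += 1
        value = value[0]
    return rank


state = [1, 2, 3, 4]
print(list_rank(state))    # rank of the original input
print(list_rank([state]))  # rank after wrapping it in another list
```

Wrapping the list once adds exactly one leading dimension of size 1, which is what a placeholder of shape [1, 4] expects.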

thanks

EDIT2: OK, I have changed a lot since the last EDIT, and I lost track of the changes, so I can't record them all. In case you want to see my code, here it is. I know it's messy as fuck. For now I am just so happy that shit is working :"D. You probably won't be able to understand my code, it's a total mess, but I just wanted to document the current state. A big thanks to Ali Abbasi :D.

Upvotes: 0

Views: 299

Answers (1)

Ali

Reputation: 974

As you know, MatMul is the classic matrix multiplication operation. If we want to multiply two matrices, M1 and M2, with shapes AxB and BxC respectively, the inner dimension B must be the same in both, and then:

M1 x M2 results in a matrix R with shape AxC.
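The shape rule can be sketched in plain Python. This is only an illustration of the rule, not TensorFlow's actual implementation; matmul_result_shape is a made-up helper name and its error text only imitates the real message:

```python
def matmul_result_shape(shape_a, shape_b):
    """Return the shape of a matrix product, or raise ValueError."""
    # Both operands must be rank 2 (real matrices).
    for shape in (shape_a, shape_b):
        if len(shape) != 2:
            raise ValueError(
                "Shape must be rank 2 but is rank %d: %r" % (len(shape), list(shape))
            )
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # The inner dimensions (B in AxB and BxC) must agree.
    if cols_a != rows_b:
        raise ValueError("Inner dimensions differ: %d vs %d" % (cols_a, rows_b))
    return (rows_a, cols_b)


print(matmul_result_shape([1, 4], [4, 32]))
try:
    matmul_result_shape([4], [4, 32])
except ValueError as err:
    print("ValueError:", err)
```

The second call fails on the rank check before the inner dimensions are even compared, which mirrors the error in the question.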

In your case the first tensor has shape [4], which is rank 1 and not a matrix at all, while the second has shape 4x32, so MatMul rejects it. Transposing does not help here, because the transpose of a rank 1 tensor is still rank 1; you need to reshape the first tensor into a 1x4 matrix. Then you have:

1x4 MatMul 4x32 results in a 1x32 matrix.

Your code here:

h_layer1 = tensorflow.add(tensorflow.matmul(input_layer, w_layer1),b_layer1)

Use it like this:

h_layer1 = tensorflow.add(tensorflow.matmul(tensorflow.reshape(input_layer, [1, 4]), w_layer1), b_layer1)
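To make the 1x4 times 4x32 case concrete without needing TensorFlow, here is a rough pure-Python sketch of what the layer computes. The matmul helper and the constant weight values are placeholders for illustration only (the real weights come from truncated_normal):

```python
def matmul(a, b):
    """Multiply an AxB matrix by a BxC matrix, both given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for row in a
    ]


state = [[1.0, 2.0, 3.0, 4.0]]             # 1x4 input (rank 2)
weights = [[0.01] * 32 for _ in range(4)]  # 4x32, placeholder values
bias = [0.01] * 32                         # one bias per output unit

product = matmul(state, weights)           # 1x32
h_layer1 = [[v + b for v, b in zip(row, bias)] for row in product]
print(len(h_layer1), len(h_layer1[0]))     # prints "1 32"
```

The bias has shape [32] and is added to every row of the 1x32 product, which is why only the input needed an extra dimension.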

To investigate further, you can print the shape of a tensor at each stage, like this:

print(h_layer1.get_shape())

Once you see the shapes, you can adjust your tensors and inputs accordingly.

Good Luck.

Upvotes: 1
