yusuf

Reputation: 3781

InvalidArgumentError on softmax in TensorFlow

I have the following function:

def forward_propagation(self, x):
    # The total number of time steps
    T = len(x)
    # During forward propagation we save all hidden states in s because we need them later.
    # We add one additional element for the initial hidden state, which we set to 0.
    s = tf.zeros([T + 1, self.hidden_dim])
    # The outputs at each time step. Again, we save them for later.
    o = tf.zeros([T, self.word_dim])

    a = tf.placeholder(tf.float32)
    b = tf.placeholder(tf.float32)
    c = tf.placeholder(tf.float32)

    s_t = tf.nn.tanh(a + tf.reduce_sum(tf.multiply(b, c)))
    o_t = tf.nn.softmax(tf.reduce_sum(tf.multiply(a, b)))
    # For each time step...
    with tf.Session() as sess:
        s = sess.run(s)
        o = sess.run(o)
        for t in range(T):
            # Note that we are indexing U by x[t]. This is the same as multiplying U with a one-hot vector.
            s[t] = sess.run(s_t, feed_dict={a: self.U[:, x[t]], b: self.W, c: s[t-1]})
            o[t] = sess.run(o_t, feed_dict={a: self.V, b: s[t]})
    return [o, s]

self.U, self.V, and self.W are NumPy arrays. I try to compute the softmax with

o_t = tf.nn.softmax(tf.reduce_sum(tf.multiply(a, b)))

in the graph, and it gives me an error on this line:

o[t] = sess.run(o_t, feed_dict={a: self.V, b: s[t]})

The error is:

InvalidArgumentError (see above for traceback): Expected begin[0] == 0 (got -1) and size[0] == 0 (got 1) when input.dim_size(0) == 0
[[Node: Slice = Slice[Index=DT_INT32, T=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Shape_1, Slice/begin, Slice/size)]]

How am I supposed to compute the softmax in TensorFlow?

Upvotes: 2

Views: 707

Answers (1)

Till Hoffmann

Reputation: 9877

The problem arises because you call tf.reduce_sum on the argument of tf.nn.softmax. Without an axis argument, tf.reduce_sum collapses its input to a scalar, and a scalar is not a valid input to softmax, so the op fails. Did you mean to use tf.matmul instead of the combination of tf.reduce_sum and tf.multiply?
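For instance, the matrix-vector products in your function could be written with tf.matmul (a minimal sketch, not your original code; the placeholder shapes are assumptions based on the values you feed in):

import tensorflow as tf

b = tf.placeholder(tf.float32, shape=[None, None])  # a matrix, e.g. self.W
c = tf.placeholder(tf.float32, shape=[None])        # a vector, e.g. s[t-1]

# tf.matmul expects 2-D operands, so lift the vector to a column,
# multiply, and drop the extra dimension afterwards.
dot = tf.squeeze(tf.matmul(b, tf.expand_dims(c, 1)), axis=[1])

tf.nn.softmax can then be applied to the resulting vector directly.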

Edit: TensorFlow does not provide an equivalent of np.dot out of the box. If you want to compute the dot product of a matrix and a vector, you need to sum over the shared index explicitly:

# equivalent to np.dot(a, b) if a.ndim == 2 and b.ndim == 1
c = tf.reduce_sum(a * b, axis=1)
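As a quick check of the equivalence (a self-contained example under TensorFlow 1.x, with arbitrary values):

import numpy as np
import tensorflow as tf

a_np = np.array([[1., 2.], [3., 4.]], dtype=np.float32)  # matrix, a.ndim == 2
b_np = np.array([5., 6.], dtype=np.float32)              # vector, b.ndim == 1

a = tf.placeholder(tf.float32, shape=[None, None])
b = tf.placeholder(tf.float32, shape=[None])
c = tf.reduce_sum(a * b, axis=1)

with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: a_np, b: b_np}))  # [17. 39.]
print(np.dot(a_np, b_np))                             # [17. 39.]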

Upvotes: 2
