Anubhav Singh

Reputation: 8699

ValueError with 'MatMul' in tensorflow

I am new to TensorFlow and have just started learning it from the official TensorFlow website. I am trying to implement softmax regression, but I get the following error:

ValueError: Dimensions must be equal, but are 784 and 10 for 'MatMul' (op: 'MatMul') with input shapes: [?,784], [10,784].

Here is the complete code:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist=input_data.read_data_sets("MNIST_data/",one_hot=True)

x=tf.placeholder(tf.float32,[None,784])
W=tf.Variable(tf.zeros([10,784]))
b=tf.Variable(tf.zeros([10]))
y=tf.nn.softmax(tf.matmul(x,W)+b)

y_=tf.placeholder(tf.float32,[None,10])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()

for _ in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Here is the output I am getting: [screenshot not preserved]

Thanks in advance.

Upvotes: 0

Views: 320

Answers (1)

ml4294

Reputation: 2629

Note that with your definition of x, each training sample is a row vector (x1, ..., x784), and the number of rows of x equals the number of samples in a batch. The dimension that matters for the multiplication is therefore the number of columns of x (784), not the number of rows, as one might expect. tf.matmul(x, W) multiplies x from the left onto the weight matrix W, so W must have 784 rows; the result then has shape (num_samples_per_batch, 10). To make this multiplication valid, swap the dimensions of W as follows:

W=tf.Variable(tf.zeros([784,10]))
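The shape rule can be checked without TensorFlow. The following is a minimal NumPy sketch, not part of the original code; the batch size of 5 and the variable names are assumptions chosen for illustration. It reproduces the mismatch from the question and then shows the shape that results from the fix.

```python
import numpy as np

x_batch = np.zeros((5, 784))      # 5 samples, stand-in for x with shape [?, 784]
W_question = np.zeros((10, 784))  # shape used in the question
W_fixed = np.zeros((784, 10))     # shape after the fix
b = np.zeros(10)

try:
    x_batch @ W_question          # inner dimensions 784 and 10 do not match
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("shape mismatch:", err)

logits = x_batch @ W_fixed + b    # inner dimensions 784 and 784 match
print(logits.shape)               # (5, 10)
```

Keeping W as [10, 784] and calling tf.matmul(x, W, transpose_b=True) should also make the shapes agree, but storing W as [784, 10] matches the tutorial's convention.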

By the way: tf.nn.softmax_cross_entropy_with_logits() expects unscaled logits (https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits) and applies the softmax internally, so you should not apply tf.nn.softmax() to its input yourself. I think it would therefore be better to change

y=tf.nn.softmax(tf.matmul(x,W)+b)

to

y=tf.matmul(x,W)+b

and

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

to

correct_prediction = tf.equal(tf.argmax(tf.nn.softmax(y),1), tf.argmax(y_,1))
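To see why the extra softmax matters, here is a small NumPy sketch. It is an illustration under assumptions: the helper functions and the toy logits are made up, and cross_entropy_from_logits only imitates what the TensorFlow op does (softmax followed by cross-entropy). It compares the loss computed from raw logits with the loss computed from already-softmaxed values.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_from_logits(logits, labels):
    """Mean cross-entropy; applies softmax internally, as the TF op does."""
    log_probs = np.log(softmax(logits))
    return -(labels * log_probs).sum(axis=1).mean()

# Toy data (hypothetical): two samples, three classes, both predicted correctly.
logits = np.array([[4.0, 1.0, 0.0], [0.5, 2.5, 0.0]])
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

loss_correct = cross_entropy_from_logits(logits, labels)
loss_double = cross_entropy_from_logits(softmax(logits), labels)  # softmax applied twice

# Softmax preserves the ordering within each row, so argmax is unchanged.
same_argmax = np.array_equal(logits.argmax(axis=1),
                             softmax(logits).argmax(axis=1))

print(loss_correct, loss_double, same_argmax)
```

With the softmax applied twice, the inputs to the second softmax all lie between 0 and 1, so the predicted probabilities are flattened and the loss cannot get close to zero even for confident, correct predictions. Because argmax is unaffected by softmax, the tf.nn.softmax() inside correct_prediction is harmless but not strictly required for the accuracy.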

This is the example with the modifications:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist=input_data.read_data_sets("MNIST_data/",one_hot=True)

x=tf.placeholder(tf.float32,[None,784])
W=tf.Variable(tf.zeros([784, 10]))
b=tf.Variable(tf.zeros([10]))
# y=tf.nn.softmax(tf.matmul(x,W)+b)
y = tf.matmul(x,W) + b

y_=tf.placeholder(tf.float32,[None,10])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess=tf.InteractiveSession()
tf.global_variables_initializer().run()

for _ in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct_prediction = tf.equal(tf.argmax(tf.nn.softmax(y),1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
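For readers who cannot run the TensorFlow 1.x code above, the same training loop can be sketched in plain NumPy. This is an illustration under stated assumptions, not a port of the tutorial: it uses a small synthetic dataset instead of MNIST, the gradient is written out by hand, and all names are hypothetical. The weight matrix keeps the [features, classes] layout from the fix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST (assumption for illustration):
# 20 features, 3 classes, labels generated by a hidden linear model.
n_samples, n_features, n_classes = 600, 20, 3
true_W = rng.normal(size=(n_features, n_classes))
X = rng.normal(size=(n_samples, n_features))
labels = (X @ true_W).argmax(axis=1)
Y = np.eye(n_classes)[labels]             # one-hot labels, like one_hot=True

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((n_features, n_classes))     # [features, classes], as in the fix
b = np.zeros(n_classes)
learning_rate = 0.5

for _ in range(1000):
    idx = rng.integers(0, n_samples, size=100)   # random mini-batch of 100
    xb, yb = X[idx], Y[idx]
    probs = softmax(xb @ W + b)                  # softmax applied once, to raw logits
    grad_logits = (probs - yb) / len(xb)         # gradient of the mean cross-entropy
    W -= learning_rate * (xb.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

# Accuracy on the training data (there is no separate test split in this sketch).
accuracy = ((X @ W + b).argmax(axis=1) == labels).mean()
print(accuracy)
```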

Upvotes: 1
