SRobertJames

Reputation: 9263

Two layer neural network performs worse than single layer

I'm learning TensorFlow, and trying to create a simple two layer neural network.

The tutorial code at https://www.tensorflow.org/get_started/mnist/pros starts with this simple network, which gets 92% accuracy:

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

I tried replacing it with this very simple network, adding a new layer, but accuracy now drops to 84%!!!

layer1_len = 10
# hidden layer: 784 inputs -> layer1_len ReLU units
w1 = weight_var([784, layer1_len])
b1 = bias_var([layer1_len])
o1 = tf.nn.relu(tf.matmul(x, w1) + b1)
# output layer: layer1_len units -> 10 class scores, then softmax
w2 = weight_var([layer1_len, 10])
b2 = bias_var([10])
y = tf.nn.softmax(tf.matmul(o1, w2) + b2)

I get that result with several different values for layer1_len as well as different numbers of training steps. (Note that if I omit the weight_var and bias_var random initialization, and keep everything at zero, accuracy drops to close to 10%, essentially no better than guessing.)
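In case it matters, weight_var and bias_var are small helpers along the lines of the tutorial's weight_variable / bias_variable:

def weight_var(shape):
    # truncated-normal init breaks the symmetry between hidden units
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_var(shape):
    # small positive bias so the ReLUs start out active
    return tf.Variable(tf.constant(0.1, shape=shape))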

What am I doing wrong?

Upvotes: 2

Views: 945

Answers (1)

Salvador Dali

Reputation: 222811

There is nothing wrong. The problem is that adding layers does not automatically mean higher accuracy (otherwise machine learning would be more or less solved: whenever you needed a better image classifier, you would just add one more layer to Inception and claim victory).

To show you that this is not just your problem, take a look at this well-known paper: Deep Residual Learning for Image Recognition. The authors observe that simply stacking more layers makes the score worse (the observation) and design an architecture that overcomes this degradation (the important part). Here is a small excerpt from it:

[figure from the paper: training and test error curves for plain networks of two different depths]

"The deeper network has higher training error, and thus test error."
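For intuition, the paper's fix is to add identity skip connections, so that an extra block can default to passing its input through instead of having to re-learn the identity. A minimal sketch in the question's own style (it reuses the weight_var / bias_var helpers from the question; the residual_block name is mine):

def residual_block(x, dim):
    # F(x): two fully connected layers
    w1 = weight_var([dim, dim])
    b1 = bias_var([dim])
    h = tf.nn.relu(tf.matmul(x, w1) + b1)
    w2 = weight_var([dim, dim])
    b2 = bias_var([dim])
    # output is relu(F(x) + x): if the layers learn nothing useful,
    # the block still passes x through unchanged
    return tf.nn.relu(tf.matmul(h, w2) + b2 + x)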

Upvotes: 5
