Reputation: 27
Hi, I'm new to neural networks with TensorFlow. I've taken a small fraction of the Places365 dataset, and I want to build a neural network to classify between 10 places.
For that I've tried to make a small copy of a VGG network. The problem I have is that the output of the softmax function is a one-hot encoded array. Looking for problems in my code, I've realised that the outputs of the ReLU functions are either 0 or a very large number (around 10000).
I don't know where I'm going wrong. Here is my code:
import tensorflow as tf

# Placeholders and hyperparameters (values taken from the question text:
# 256x256 normalized grayscale input, 10 classes, learning rate 0.1)
input = tf.placeholder(tf.float32, [None, 256, 256, 1])
output = tf.placeholder(tf.float32, [None, 10])
learning_rate = 0.1

def variables(shape):
    # uniform initialization in [-1, 1]
    return tf.Variable(2 * tf.random_uniform(shape, seed=1) - 1)

def layerConv(x, filter):
    return tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='SAME')

def maxpool(x):
    return tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')

# convolution + pooling blocks: 256x256 -> 128x128 -> 64x64 -> 32x32
weights0 = variables([3, 3, 1, 16])
l0 = tf.nn.relu(layerConv(input, weights0))
l0 = maxpool(l0)

weights1 = variables([3, 3, 16, 32])
l1 = tf.nn.relu(layerConv(l0, weights1))
l1 = maxpool(l1)

weights2 = variables([3, 3, 32, 64])
l2 = tf.nn.relu(layerConv(l1, weights2))
l2 = maxpool(l2)

# fully connected layers
l3 = tf.reshape(l2, [-1, 64 * 32 * 32])
syn0 = variables([64 * 32 * 32, 1024])
bias0 = variables([1024])
l4 = tf.nn.relu(tf.matmul(l3, syn0) + bias0)
l4 = tf.layers.dropout(inputs=l4, rate=0.4)

syn1 = variables([1024, 10])
bias1 = variables([10])
output_pred = tf.nn.softmax(tf.matmul(l4, syn1) + bias1)

# sum-of-squares error between the softmax output and the one-hot labels
error = tf.square(tf.subtract(output_pred, output), name='error')
loss = tf.reduce_sum(error, name='cost')

# TRAINING
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train = optimizer.minimize(loss)
The input to the neural network is a normalized grayscale image of 256*256 pixels. The learning rate is 0.1 and the batch size is 32.
Thank you in advance!
Upvotes: 1
Views: 831
Reputation: 7148
The problem you have is your weight initialization. Neural networks are highly complicated non-convex optimization problems, so a good initialization is paramount to getting any good results. If you use ReLUs, you should use the initialization proposed by He et al. (https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf?spm=5176.100239.blogcont55892.28.pm8zm1&file=He_Delving_Deep_into_ICCV_2015_paper.pdf).
In essence, the weights of your network should be initialized with i.i.d. Gaussian-distributed values with mean 0 and a standard deviation of:
stddev = sqrt(2 / Nr_input_neurons)
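As a minimal sketch, the variables() helper from the question could be swapped for something like this (assuming the same TF 1.x style as the question; the he_variables name and the fan-in computation from the weight shape are my own illustration):

import numpy as np
import tensorflow as tf

def he_variables(shape):
    # fan_in: number of inputs feeding each unit
    # (kernel_h * kernel_w * in_channels for conv kernels, n_inputs for dense weights)
    fan_in = int(np.prod(shape[:-1]))
    stddev = np.sqrt(2.0 / fan_in)
    return tf.Variable(tf.random_normal(shape, mean=0.0, stddev=stddev, seed=1))

# e.g. weights0 = he_variables([3, 3, 1, 16])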
Upvotes: 1
Reputation: 7399
Essentially, this is what ReLU does:

def relu(vector):
    # zero out every negative entry (vector is assumed to be a NumPy array)
    vector[vector < 0] = 0
    return vector
and softmax:

def softmax(x):
    # subtract the max for numerical stability before exponentiating
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)
The output of softmax being a one-hot encoded array means there is a problem, and that could be many things.
For starters, you can try reducing the learning_rate; use something like 1e-4 or 1e-3 and check (a quick sketch follows below). If that doesn't work, try adding some regularization. I am also skeptical about your weight initialization.
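For the learning rate, a minimal change to try, keeping the rest of the question's graph as-is (1e-3 is only an illustrative starting point, not a tuned value):

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-3)
train = optimizer.minimize(loss)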
Regularization: This is a form of regression that constrains/regularizes or shrinks the coefficient estimates towards zero. In other words, this technique discourages learning a more complex or flexible model, so as to avoid the risk of overfitting. - Regularization in ML
Link to: Build a multilayer neural network with L2 regularization in tensorflow
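As a minimal sketch of what L2 regularization could look like on top of the question's loss (syn0, syn1 and error are the tensors from the question; beta is an illustrative hyperparameter that would need tuning):

beta = 0.01  # illustrative regularization strength, needs tuning
l2_penalty = tf.nn.l2_loss(syn0) + tf.nn.l2_loss(syn1)
loss = tf.reduce_sum(error, name='cost') + beta * l2_penalty
train = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)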
Upvotes: 2