Zi-yan Tseng

Reputation: 184

TensorFlow convolution returns NaN

I'm trying to read images from a folder and perform a convolution on them.

First, I read these images and package them as .tfrecords, then decode the TFRecords with tf.train.batch. Next, I feed all the data (image, label) into the convolution (code below). At this step, the bias (b_conv1) and weights (w_conv1) become NaN, and the model stops working.

image_batch, label_batch = decodeTFRecord(dirPath, batch_size)
image_batch = tf.reshape(image_batch, [-1, 128*128])
label_batch = tf.one_hot(label_batch, Label_size)

x = tf.placeholder(tf.float32, [None, 128*128])
y = tf.placeholder(tf.float32, [None, 10])
x_image = tf.reshape(x, [-1, 128, 128, 1])  # tf.convert_to_tensor is redundant; reshape already returns a tensor
# conv1 layer
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)  # output size = 128*128*32
h_pool1 = max_pool_2x2(h_conv1)  # output size = 64*64*32

conv2d function:

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

max_pool_2x2 function:

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

Full Code: https://codeshare.io/5O7ddj

Upvotes: 1

Views: 1793

Answers (1)

David Parks

Reputation: 32111

Try a smaller learning rate to start with, e.g. 1e-5, and make your initial weights smaller, e.g. tf.truncated_normal(shape, stddev=0.0001), and see if either of those common fixes solves your problem.

Based on the comments, it sounds like one of these two common issues caused the NaN problem (please comment if I misread your comment).

This issue often occurs right after the weights are randomly initialized: large weights have a long way to travel to improve, which can produce very steep gradient steps, i.e. the exploding gradient problem. Smaller weights and learning rates ameliorate the issue.
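To see the mechanism concretely, here is a toy NumPy sketch (illustrative layer count and scales, not the asker's actual conv net): with ReLU layers, large random weights grow the activation magnitude by roughly sqrt(fan_in/2) per layer, so values overflow float32 into inf/NaN, while the small stddev suggested above keeps everything finite.

```python
import numpy as np

np.seterr(all="ignore")  # the overflow is the point of the demo
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 128)).astype(np.float32)

def forward(x, stddev, layers=60):
    """Run x through a stack of random fully connected ReLU layers."""
    h = x
    for _ in range(layers):
        w = rng.normal(scale=stddev, size=(128, 128)).astype(np.float32)
        h = np.maximum(h @ w, 0.0)  # ReLU, as in the question's conv layer
    return h

big = forward(x, stddev=1.0)       # large init: activations explode
small = forward(x, stddev=0.0001)  # small init, as suggested above

print(np.isfinite(big).all())    # False: overflow produced inf/NaN
print(np.isfinite(small).all())  # True: activations stay finite
```

Once an inf appears, the next matmul mixes +inf and -inf contributions and produces NaN, which then propagates into the weights through the gradients, exactly the symptom in the question.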

Also notable: BatchNorm tends to ameliorate the problem as well. You can generally get away with much larger learning rates precisely because BatchNorm keeps the activations from getting really out of whack as the signal travels through the network.
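A minimal NumPy sketch of why that works (a plain per-feature standardization stands in for tf.layers' batch normalization; the learnable scale/shift and moving averages are omitted): normalizing each feature over the batch after every layer keeps activations O(1), so a deep ReLU stack stays finite even with large random weights.

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(32, 128)).astype(np.float32)  # a batch of 32 inputs

for _ in range(60):
    w = rng.normal(scale=1.0, size=(128, 128)).astype(np.float32)  # large init
    z = h @ w
    # BatchNorm-style standardization: zero mean, unit variance per feature
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + 1e-5)
    h = np.maximum(z, 0.0)  # ReLU

print(np.isfinite(h).all())  # True: activations stay bounded
```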

Upvotes: 3
