costisst

Reputation: 391

Tensorflow: My loss function produces a huge number

I'm trying image inpainting using a NN with weights pretrained using denoising autoencoders. All according to https://papers.nips.cc/paper/4686-image-denoising-and-inpainting-with-deep-neural-networks.pdf

I have implemented the custom loss function they use.

My dataset is a batch of overlapping patches (196 x 32 x 32) of an image. My inputs are the corrupted patches and the outputs should be the cleaned ones.

Part of my loss function is

dif_y = tf.subtract(y_xi, y_)

dif_norm = tf.norm(dif_y, ord='euclidean', axis=(1, 2))

Where y_xi (196 x 1 x 3072) is the reconstructed clean image and y_ (196 x 1 x 3072) is the real clean image. So effectively I subtract each reconstructed patch from its clean version and sum up all those differences, which is why I think it is normal for the result to be a very big number.

train_step = tf.train.AdamOptimizer().minimize(loss)

The loss value starts at around 3*10^7 and converges after about 200 runs (I loop for 1000) to a value very close to where it started. So my output image will be miles away from the original.

Edit: starts at 3.02391e+07 and converges to 3.02337e+07

Could my loss value actually be correct? If so, how can I dramatically reduce it?

Thanks

Edit 2: My loss function

dif_y = tf.subtract(y, y_)                               # per-patch difference, shape (196, 1, 3072)
dif_norm = tf.norm(dif_y, ord='euclidean', axis=(1, 2))  # Euclidean norm of each patch, shape (196,)
sqr_norm = tf.square(dif_norm)                           # squared norm per patch
prod = tf.multiply(sqr_norm, 0.5)                        # half the squared norm
sum_norm2 = tf.reduce_sum(prod, 0)                       # sum over the 196 patches
error_1 = tf.divide(sum_norm2, 196)                      # divide by the batch size
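
In other words, with ŷ_i the i-th reconstructed patch and y_i the i-th clean patch, this is meant to compute

error_1 = (1/196) * Σ_i (1/2) * ‖ŷ_i − y_i‖²

i.e. the batch average of half the squared Euclidean norm of each patch difference.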

Upvotes: 3

Views: 5554

Answers (2)

costisst

Reputation: 391

Just for the record, in case anyone else has a similar problem: remember to normalize your data! I was actually subtracting values in the range [0,1] from values in the range [0,255]. A very noobish mistake; I learned it the hard way!

Input values / 255

Expected values / 255

Problem solved.
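
For example, with NumPy (the array names here are just placeholders, not my actual variable names):

import numpy as np

# placeholder uint8 patch batches in [0, 255], shaped like in the question (196 x 1 x 3072)
x_batch = np.random.randint(0, 256, size=(196, 1, 3072), dtype=np.uint8)  # corrupted inputs
y_batch = np.random.randint(0, 256, size=(196, 1, 3072), dtype=np.uint8)  # clean targets

# scale both the inputs and the expected outputs to [0, 1]
x_batch = x_batch.astype(np.float32) / 255.0
y_batch = y_batch.astype(np.float32) / 255.0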

Upvotes: 6

Anton Panchishin

Reputation: 3763

sum_norm2 = tf.reduce_sum(prod,0) - I don't think this is doing what you want it to do.

Say y and y_ have values for 500 images and you have 10 labels, giving a 500x10 matrix. When tf.reduce_sum(prod,0) processes that, you will end up with one value that is the sum of 500 values, each of which is the sum of all the values in the second rank.

I don't think that is what you want: the sum of the error across each label. Probably what you want is the average; at least in my experience that is what works wonders for me. Additionally, I don't want a whole bunch of losses, one for each image, but instead one loss for the whole batch.

My preference is to use something like

loss = tf.reduce_mean(tf.reduce_mean(prod))

This has the additional benefit of keeping your optimizer parameters simple. I haven't yet run into a situation where I had to use anything other than 1.0 for the learning_rate of GradientDescent, Adam, or MomentumOptimizer.

Now your loss will be independent of batch size or number of labels.
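
A quick toy check of the difference (not your exact code; just the 196-patch shape from your question, written against the TF 1.x API you are using):

import tensorflow as tf

prod = tf.fill([196], 2.0)            # pretend per-patch errors
loss_sum = tf.reduce_sum(prod, 0)     # 392.0: grows with the number of patches
loss_mean = tf.reduce_mean(prod)      # 2.0: unchanged if the batch size changes

with tf.Session() as sess:
    print(sess.run([loss_sum, loss_mean]))  # [392.0, 2.0]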

Upvotes: 1
