Reputation: 11
My task is to produce saliency maps with a VGG-based network, but the MSE loss won't decrease the way I expect, and I can't figure out why. P.S. the training dataset is SALICON. Here's the output:
training epoch 1, loss value is 0.041423566639423
training epoch 2, loss value is 0.041423123329878
training epoch 3, loss value is 0.041430559009314
training epoch 4, loss value is 0.041424177587032
...
training epoch 20, loss value is 0.041416928172112
I have tried changing the optimizer, the learning rate, and the loss function, but none of it helps. Here's my code:
def shuffle(photo, grdtr, shuffle=True):
    idx = np.arange(0, len(photo))
    if shuffle:
        np.random.shuffle(idx)
    photo_shuffle = [photo[i] for i in idx]
    grdtr_shuffle = [grdtr[i] for i in idx]
    return np.asarray(photo_shuffle), np.asarray(grdtr_shuffle)
if __name__ == '__main__':
    # create the model
    x = tf.placeholder(tf.float32, [None, 48, 64, 3])
    y_ = tf.placeholder(tf.float32, [None, 48 * 64, 1])
    h = tf.placeholder(tf.float32, [None, 48, 64, 1])
    y = deepnn(x)

    # define loss and optimizer
    # cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=(tf.clip_by_value(y, 1e-8, tf.reduce_max(y))), labels=y_))
    y_ = tf.nn.softmax(y_, dim=1)
    cross_entropy = tf.reduce_mean(tf.pow(y - y_, 2))
    # cross_entropy = tf.reduce_mean(y_ * tf.log(y_ / y))  # KL
    tf.summary.scalar('loss', cross_entropy)
    train_step = tf.train.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999).minimize(cross_entropy)

    # do the training
    with tf.Session() as sess:
        ...
        # load the training data
        photos = np.load('./data/npy/SALICON100/photos_queue.npy')
        grdtrs = np.load('./data/npy/SALICON100/grdtrs_queue.npy')
        photos = photos / 255.0
        grdtrs = grdtrs / 255.0

        EPOCH = 20
        BATCH_SIZE = 20
        TRAINING_SET_SIZE = 20

        for j in range(EPOCH):
            # photos, grdtrs = shuffle(photos, grdtrs, shuffle=False)
            grdtrs = np.resize(grdtrs, [TRAINING_SET_SIZE, 48, 64, 20])
            grdtrs = np.reshape(grdtrs, [TRAINING_SET_SIZE, 48 * 64, 20])
            _, loss_value, pred_y = sess.run([train_step, cross_entropy, y], feed_dict={x: photos[:20], y_: grdtrs[:20]})
            if (j + 1) % 1 == 0:
                print('training epoch %d, loss value is %.15f' % (j + 1, loss_value))
        np.save('./data/20_photos_test/net_output.npy', pred_y)
        np.save('./data/20_photos_test/net_grdtrs.npy', grdtrs[:20])
        # stop the queue threads and properly close the session
        ...
Here is some additional code showing how the tensors used in the session are defined:
x = tf.placeholder(tf.float32, [None, 48, 64, 3])
y_ = tf.placeholder(tf.float32, [None, 48 * 64, 1])
y = deepnn(x)
cross_entropy = tf.reduce_sum(tf.pow(y-y_sm,2))
Upvotes: 1
Views: 203
Reputation: 7148
In the code you posted you never actually run your train step. You need to call something along the lines of sess.run(train_step, feed_dict=...) to actually train your network. If you do not train your network, the loss obviously will not decrease.
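For reference, a minimal training-loop sketch (TF 1.x), assuming photos, grdtrs, and the graph built with deepnn exactly as in your question:

    # minimal sketch: the optimizer op must be among the fetches,
    # otherwise no gradient update is applied and the loss stays flat
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(20):
            _, loss_value = sess.run(
                [train_step, cross_entropy],
                feed_dict={x: photos[:20], y_: grdtrs[:20]})
            print('epoch %d, loss %.6f' % (epoch + 1, loss_value))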
Also, are you sure you want to apply a softmax to your labels?
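If your ground-truth maps are already scaled to [0, 1], a plain MSE against them (with no softmax on the labels) may be closer to what you want. A sketch, using a hypothetical labels placeholder in place of your softmaxed y_:

    # sketch: MSE directly against the normalized saliency maps, no softmax
    labels = tf.placeholder(tf.float32, [None, 48 * 64, 1])
    y = deepnn(x)  # assumed to produce a tensor matching the label shape
    mse = tf.reduce_mean(tf.square(y - labels))
    train_step = tf.train.AdamOptimizer(1e-3).minimize(mse)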
Upvotes: 1