Reputation: 161
I'm training a small custom network with Keras (2.1.6) and TensorFlow (1.4.0) as the backend. During training, I use the TensorBoard callback as:
tensorboard = keras.callbacks.TensorBoard(
    log_dir=OUTPUT_PATH,
    histogram_freq=EPOCH_STEPS,
    batch_size=BATCH_SIZE,
    write_grads=True)
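For context, a minimal sketch of how such a callback is usually attached to training; the model, the data arrays, and NUM_EPOCHS below are placeholders, not taken from the question. Note that in this Keras version histogram_freq/write_grads need validation data, since the histograms are computed on it:
# Hypothetical training call; `model`, the arrays and NUM_EPOCHS are placeholders.
model.fit(x_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=NUM_EPOCHS,
          # histogram_freq / write_grads require validation data,
          # so the logged histograms are computed on (x_val, y_val)
          validation_data=(x_val, y_val),
          callbacks=[tensorboard])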
This produces the expected output, but when I look at the gradient distributions on TensorBoard, I see strange spikes in the graphs, which repeat at an interval equal to histogram_freq.
For example, with histogram_freq=1 and a convolution layer with a single (1, 1) kernel, the distributions look like this:
In both images you can see spikes at an interval of 1. As additional information, the network works on images of resolution 320x200, and the output is a full 320x200 image that gets compared with its label (segmentation). Could that be the problem?
Upvotes: 0
Views: 162
Reputation: 4868
A wild guess, but it looks like the gradients go crazy at the start of each epoch, so maybe you are accidentally running tf.global_variables_initializer() at the beginning of every epoch?
Do the weight distributions show the same pattern?
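To illustrate the guess, this is the kind of hypothetical callback that would cause exactly that pattern; the class name is made up for this sketch, and the API calls match the Keras 2.1.x / TF 1.4 versions in the question:
import keras
import keras.backend as K
import tensorflow as tf

# Anti-pattern sketch: re-running the global initializer at the start of
# every epoch resets all variables (weights and optimizer state), so the
# gradients would spike right after each epoch boundary.
class ResetEveryEpoch(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs=None):
        K.get_session().run(tf.global_variables_initializer())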
Upvotes: 0