Reputation: 33
This is the loss plot of my WGAN-GP after training for 14,000 iterations. My image size is 128x128. Although the loss plot appears to be converging, the generator loss at iteration 14,000 is -26,646 and the critic loss is -249,909. Is it normal for the losses to reach such large negative values, or does this indicate a problem with my training?
Upvotes: 3
Views: 1626
Reputation: 1277
Batch normalization in the discriminator breaks Wasserstein GANs with gradient penalty: the penalty is computed with respect to each input individually, while batch normalization couples the samples within a batch. The authors themselves advocate using layer normalization instead, and this is even written in bold in their paper (https://papers.nips.cc/paper/7159-improved-training-of-wasserstein-gans.pdf). It is hard to say whether there are other bugs in your code, but I urge you to read the DCGAN and Wasserstein GAN papers thoroughly and take careful notes on the hyperparameters. Getting them wrong destroys the performance of a GAN, and a hyperparameter search gets expensive quite quickly.
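As an illustration, here is a minimal PyTorch sketch of a critic built without batch normalization. The layer sizes are my assumptions for a 128x128 input, not code from your question; I use `nn.GroupNorm(1, ...)`, which normalizes each sample over all channels and spatial positions, i.e. layer normalization for convolutional feature maps:

```python
import torch
import torch.nn as nn

class CriticBlock(nn.Module):
    """Downsampling block for a WGAN-GP critic (hypothetical sizes).

    GroupNorm with a single group normalizes each sample independently,
    so, unlike BatchNorm, it introduces no coupling between samples and
    the per-sample gradient penalty stays well-defined.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=4,
                      stride=2, padding=1),
            nn.GroupNorm(1, out_channels),  # layer norm, NOT BatchNorm2d
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


# Example: a critic for 128x128 RGB images built from such blocks.
critic = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1),  # 128 -> 64, no norm on input
    nn.LeakyReLU(0.2),
    CriticBlock(64, 128),    # 64 -> 32
    CriticBlock(128, 256),   # 32 -> 16
    CriticBlock(256, 512),   # 16 -> 8
    nn.Conv2d(512, 1, 8),    # 8 -> 1, one scalar score per sample
)
```

Note that the critic ends in a raw linear score with no sigmoid, as the Wasserstein objective requires.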
By the way, transposed convolutions produce checkerboard artifacts in your output images. Use image resizing followed by an ordinary convolution ("resize-convolution") instead; a sketch follows below. For an in-depth explanation of that phenomenon I can recommend the following resource (https://distill.pub/2016/deconv-checkerboard/).
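For concreteness, a minimal sketch of such a resize-convolution block in PyTorch; the kernel size and nearest-neighbor mode are my choices for illustration:

```python
import torch.nn as nn

def upsample_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """Resize-convolution: upsample first, then apply a regular convolution.

    Drop-in replacement for nn.ConvTranspose2d(in_channels, out_channels,
    4, stride=2, padding=1), which tends to produce checkerboard artifacts.
    """
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="nearest"),  # or mode="bilinear"
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
    )
```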
You may also find this paper helpful: "Accelerated WGAN update strategy with loss change rate balancing".
Upvotes: 3