Reputation: 33
This is the loss plot of my WGAN-GP after training for 14,000 iterations. My image size is 128x128. Although the loss plot appears to be converging, the generator loss at iteration 14,000 is -26,646 and the critic loss is -249,909. Is it normal for the losses to reach such large negative values, or does this indicate a problem with my training?
Upvotes: 3
Views: 1626
Reputation: 1277
Batch normalization in the discriminator breaks Wasserstein GANs with gradient penalty: the penalty is computed with respect to each input individually, while batch normalization couples the samples within a batch. The authors themselves advocate using layer normalization instead, and this is even written in bold in their paper (https://papers.nips.cc/paper/7159-improved-training-of-wasserstein-gans.pdf). It is hard to say whether there are other bugs in your code, but I urge you to read the DCGAN and Wasserstein GAN papers thoroughly and take careful notes on the hyperparameters. Getting them wrong destroys the performance of a GAN, and a hyperparameter search gets expensive quite quickly.
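As an illustration, here is a minimal PyTorch sketch of a critic built without batch normalization. The layer sizes are my assumptions for a 128x128 input, not code from your question; I use `nn.GroupNorm(1, ...)`, which normalizes each sample over all channels and spatial positions, i.e. layer normalization for convolutional feature maps:

```python
import torch
import torch.nn as nn

class CriticBlock(nn.Module):
    """Downsampling block for a WGAN-GP critic (hypothetical sizes).

    GroupNorm with a single group normalizes each sample independently,
    so, unlike BatchNorm, it introduces no coupling between samples and
    the per-sample gradient penalty stays well-defined.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=4,
                      stride=2, padding=1),
            nn.GroupNorm(1, out_channels),  # layer norm, NOT BatchNorm2d
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


# Example: a critic for 128x128 RGB images built from such blocks.
critic = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1),  # 128 -> 64, no norm on input
    nn.LeakyReLU(0.2),
    CriticBlock(64, 128),    # 64 -> 32
    CriticBlock(128, 256),   # 32 -> 16
    CriticBlock(256, 512),   # 16 -> 8
    nn.Conv2d(512, 1, 8),    # 8 -> 1, one scalar score per sample
)
```

Note that the critic ends in a raw linear score with no sigmoid, as the Wasserstein objective requires.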
By the way, transposed convolutions produce checkerboard artifacts in your output images. Use image resizing followed by an ordinary convolution ("resize-convolution") instead; a sketch follows below. For an in-depth explanation of that phenomenon I can recommend the following resource (https://distill.pub/2016/deconv-checkerboard/).
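For concreteness, a minimal sketch of such a resize-convolution block in PyTorch; the kernel size and nearest-neighbor mode are my choices for illustration:

```python
import torch.nn as nn

def upsample_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """Resize-convolution: upsample first, then apply a regular convolution.

    Drop-in replacement for nn.ConvTranspose2d(in_channels, out_channels,
    4, stride=2, padding=1), which tends to produce checkerboard artifacts.
    """
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="nearest"),  # or mode="bilinear"
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
    )
```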
You may also find this paper helpful: "Accelerated WGAN update strategy with loss change rate balancing".
Upvotes: 3