How should I properly use KID Score (FID Score)?

Question

For my master's degree final project, I am trying to apply data augmentation to a dataset of thermal images (black and white) to detect breast cancer. The dataset contains only 280 images, which can be expanded to 2475 with some data augmentation techniques. Another student built a model last year that reached up to 94% accuracy.

My teachers told me it would be interesting to try data augmentation with GANs on this dataset to see whether the model improves. I built a StyleGAN model and a training function to investigate how the GAN learns and when it reaches an overfitting point (or a point where the KID/FID stops improving).

The thing is, I don't know how to properly implement the KID Score (the unbiased counterpart of the FID Score) in the training loop. I have read in some papers that you should generate at least 10k to 50k images for the KID value to be reliable, but my GPU cannot hold that much data and the execution stops.

In the early drafts of the training function, I tried to calculate KID Score like this:

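from torchmetrics.image.kid import KernelInceptionDistance

# KID Score: 64 random subsets of 20 samples each; normalize=True expects float images in [0, 1]
KID = KernelInceptionDistance(subsets=64, subset_size=20, normalize=True).to(device)
kid_score = []

# Saving KID Score: repeat the single gray channel 3 times to get the RGB input Inception expects
KID.update(real_imgs.repeat(1, 3, 1, 1), real=True)
KID.update(gen_imgs.repeat(1, 3, 1, 1), real=False)
kid_score.append(KID.compute()[0].item())  # compute() returns (mean, std); keep the mean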

I am using the torchmetrics.image.kid.KernelInceptionDistance implementation (documentation here) because it is unbiased on small datasets. In the code above, real_imgs is a batch of real images and gen_imgs is a batch of generated images. I save the KID Score every 10 epochs and run the training for 600 epochs. Because I am training a StyleGAN, real_imgs and gen_imgs have a size of 128, 64, 32 or 16 depending on the current stage of training (progressive growing of StyleGAN).
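
To make the roles of subsets and subset_size concrete, here is a minimal NumPy sketch of how a KID-style value can be computed from feature vectors. It assumes a cubic polynomial kernel with gamma = 1/d and coef = 1, and it uses random Gaussian vectors as a stand-in for Inception features. The function names (poly_kernel, mmd2_unbiased, kid) are hypothetical, and this is not the torchmetrics code itself.

```python
import numpy as np


def poly_kernel(a, b, degree=3, coef=1.0):
    """Polynomial kernel (a.b / d + coef) ** degree, with d the feature size."""
    gamma = 1.0 / a.shape[1]  # assumption: gamma defaults to 1 / feature_dim
    return (a @ b.T * gamma + coef) ** degree


def mmd2_unbiased(x, y):
    """Unbiased squared-MMD estimate between two equally sized feature sets."""
    m = x.shape[0]
    k_xx, k_yy, k_xy = poly_kernel(x, x), poly_kernel(y, y), poly_kernel(x, y)
    # Drop the diagonals of the within-set kernels (the i == j terms).
    sum_xx = k_xx.sum() - np.trace(k_xx)
    sum_yy = k_yy.sum() - np.trace(k_yy)
    return (sum_xx + sum_yy) / (m * (m - 1)) - 2.0 * k_xy.sum() / (m * m)


def kid(real_feats, fake_feats, subsets=64, subset_size=20, seed=0):
    """Mean and std of the MMD estimate over random feature subsets."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(subsets):
        r = real_feats[rng.choice(len(real_feats), subset_size, replace=False)]
        f = fake_feats[rng.choice(len(fake_feats), subset_size, replace=False)]
        scores.append(mmd2_unbiased(r, f))
    return float(np.mean(scores)), float(np.std(scores))


rng = np.random.default_rng(42)
real = rng.normal(size=(200, 16))            # stand-in for real-image features
same = rng.normal(size=(200, 16))            # same distribution as `real`
shifted = rng.normal(size=(200, 16)) + 3.0   # clearly different distribution

kid_same, _ = kid(real, same)
kid_shifted, _ = kid(real, shifted)
print(f"same distribution: {kid_same:.4f}, shifted: {kid_shifted:.4f}")
```

Because the estimator is unbiased, the value for two samples from the same distribution fluctuates around zero and can even be slightly negative, so a very small KID is not necessarily a sign of a bug.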

I read the paper "Training Generative Adversarial Networks with Limited Data", which indicates that you should generate at least 10k or 50k images. I don't know if that refers to each epoch (I don't think so...), but I thought I should generate more, so I tried generating 1200 images each epoch and feeding them to the KID metric. However, my training won't finish even with 10 checkpoints (only 10 KID Score readings) because the GPU runs out of memory.
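
As a rough back-of-the-envelope check of the memory involved, the sketch below compares holding 50k images with holding only their feature vectors, and splits the 1200 images per checkpoint into smaller chunks. It assumes float32 values, a 299x299 RGB Inception input, 2048-dimensional features, and a chunk size of 64; these are illustrative assumptions rather than measurements from my training run, and the helper names are hypothetical.

```python
def tensor_bytes(n_items, shape, bytes_per_value=4):
    """Bytes needed to hold n_items float32 tensors of the given shape."""
    size = 1
    for dim in shape:
        size *= dim
    return n_items * size * bytes_per_value


def chunk_sizes(total, chunk):
    """Split `total` samples into chunk sizes that never exceed `chunk`."""
    full, rest = divmod(total, chunk)
    return [chunk] * full + ([rest] if rest else [])


n_images = 50_000
raw_128 = tensor_bytes(n_images, (1, 128, 128))   # grayscale 128x128 images
rgb_299 = tensor_bytes(n_images, (3, 299, 299))   # assumed Inception input size
feats = tensor_bytes(n_images, (2048,))           # assumed feature dimension

for name, value in [("128x128 gray", raw_128), ("299x299 RGB", rgb_299), ("features", feats)]:
    print(f"{name:>12}: {value / 1e9:.2f} GB")

# 1200 generated images per checkpoint, processed 64 at a time (assumed chunk size).
plan = chunk_sizes(1200, 64)
print(len(plan), "chunks, last chunk has", plan[-1], "images")
```

Under these assumptions the feature vectors are far smaller than the images themselves, which is why feeding the metric in small chunks (each generated without gradient tracking and discarded after the update) may be worth trying instead of generating all the images at once.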

To clarify the situation further, here is a plot of the KID Score during training:

[Image: KID Score in my training function, 600 epochs, sampled every 10 epochs]

My questions are:

Is this a correct way of calculating the KID Score to detect overfitting or convergence of my GAN? Or should I take another approach?
Are the KID Score values correct, or should they follow a different distribution? I think the KID Score may be too small, and I don't know whether this is normal (maybe it is because of the "repeat" for the B&W images -> real_imgs.repeat(1, 3, 1, 1)).

Thank you so much in advance!

