Kong

Reputation: 2422

Some simple questions regarding the training of CNNs

I read that when using CNNs, we should have an approximately equal number of samples per class. I am doing binary classification, detecting pedestrians against background, so the two classes are pedestrian and background (anything that is not a pedestrian).

If I were to incorporate hard negative mining in my training, I would end up with more negative samples than positive ones whenever I get a lot of false positives.

1) Would this be okay?

2) If not then how do I solve this issue?

3) And what are the consequences of training a CNN with more negative than positive samples?

4) If it is okay to have more negative than positive samples, is there a maximum ratio that I should not exceed? For example, should I avoid having more than 3x as many negative samples as positives?

5) I can augment my positive samples by jittering, but how many additional samples per image should I create? Is there such a thing as 'too much'? For example, if I start with 2,000 positive samples, how many additional samples are too many? Is generating a total of 100k samples from the 2k via jittering too much? (A minimal sketch of the kind of jittering I mean is below.)
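For reference, this is roughly the kind of jittering I mean (a minimal NumPy sketch with random shifts and horizontal flips; the crop size and parameter values are just placeholders):

```python
# Minimal jittering sketch (illustrative only): random shifts and
# horizontal flips of one positive crop. `n_copies` controls how many
# extra samples are generated per original image.
import numpy as np

def jitter(image, n_copies=10, max_shift=4, seed=0):
    """Return n_copies randomly shifted / flipped variants of image (H x W x C)."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_copies):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
        if rng.random() < 0.5:          # random horizontal flip
            shifted = shifted[:, ::-1]
        out.append(shifted)
    return out

# Example: 50 jittered copies of a single 64x32 pedestrian crop.
crop = np.zeros((64, 32, 3), dtype=np.uint8)
augmented = jitter(crop, n_copies=50)
print(len(augmented), augmented[0].shape)
```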

Upvotes: 1

Views: 55

Answers (1)

Marcin Możejko

Reputation: 40516

It depends on which cost function you use, but if you use log loss (binary cross-entropy), I can show you intuitively how an unbalanced dataset may harm your training and what the possible solutions are:

a. If you don't change the distribution of your classes and leave them unbalanced, then, provided your model is able to reach a relatively small loss value, it will not only be a good pedestrian detector but will also learn that a pedestrian is a relatively rare event, which may save you from a lot of false positives. So if you are able to spend more time training a bigger model, this may give you really good results.

b. If you change the distribution of your classes, you can probably achieve relatively good results with a much smaller model in a shorter time, but, because your classifier will have learned a different class distribution, you may end up with a lot of false positives.

If the training phase of your classifier does not take too long, you may find a good compromise between these two approaches: treat the multiplication factor (e.g. whether you increase the number of positive samples 2, 3 or n times) as a hyperparameter and optimise its value, e.g. with a grid search.
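For example, here is a rough sketch of what that grid search could look like (a scikit-learn logistic regression stands in for your CNN just to keep the example self-contained; the toy data and factor values are arbitrary placeholders):

```python
# Grid search over the positive-sample multiplication factor (illustrative
# sketch only): for each factor, replicate the positives, train a stand-in
# classifier, and keep the factor with the best validation log loss.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)

# Toy stand-in data: 64-dim features, 1 = pedestrian, 0 = background.
X_pos = rng.normal(1.0, 1.0, size=(500, 64))
X_neg = rng.normal(0.0, 1.0, size=(5000, 64))
X_val = np.vstack([rng.normal(1.0, 1.0, size=(200, 64)),
                   rng.normal(0.0, 1.0, size=(2000, 64))])
y_val = np.array([1] * 200 + [0] * 2000)

best_factor, best_loss = None, np.inf
for factor in (1, 2, 3, 5):                 # how many times to replicate positives
    X_train = np.vstack([X_pos] * factor + [X_neg])
    y_train = np.array([1] * (factor * len(X_pos)) + [0] * len(X_neg))
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    loss = log_loss(y_val, clf.predict_proba(X_val)[:, 1])
    print(f"factor={factor}: validation log loss={loss:.4f}")
    if loss < best_loss:
        best_factor, best_loss = factor, loss

print("best multiplication factor:", best_factor)
```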

Upvotes: 1
