Reputation: 1652
Can you have TOO MUCH training data or not? I am working on a system that updates its training data when a user reports a mistake it has made, so that it does not make the same mistake again (i.e., if the user looks a little different from their usual training images, it adds the new capture of them to the training data). Will this decrease performance at all? Should there be a maximum? Would it be better to keep the same training set and accept the failure rate instead of trying to improve it?
Cheers!
Upvotes: 1
Views: 596
Reputation: 518
Depending on how different the user looks, this could be a problem. Let's say the user is wearing sunglasses, looks the wrong way, and wears a scarf. This would occlude too much of the face to properly determine whether the image shows a face or not. Training on such images would give horrendous results overall, because they are not something that qualifies as a face, or at least not according to the theory behind eigenfaces. If you want to keep training a model from feedback, I think you should at least have a person check the images and decide whether they are worth training on.
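To make the "have a person check the images" idea concrete, here is a minimal sketch of a review gate. It assumes feedback captures are held in a pending list and only reach the training set after a reviewer accepts them. The class name `FeedbackReviewQueue`, its methods, and the file names are all hypothetical, and the lambda only stands in for a human decision.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackReviewQueue:
    """Holds user-reported captures until a person approves or rejects them."""
    pending: list = field(default_factory=list)
    training_set: list = field(default_factory=list)

    def report_mistake(self, user_id, image):
        # Feedback never goes straight into the training data.
        self.pending.append((user_id, image))

    def review(self, is_usable_face):
        """Apply a reviewer's decision to every pending capture.

        is_usable_face(user_id, image) -> bool stands in for the human check.
        Returns the number of captures that were accepted.
        """
        accepted = [item for item in self.pending if is_usable_face(*item)]
        self.training_set.extend(accepted)
        self.pending.clear()
        return len(accepted)


queue = FeedbackReviewQueue()
queue.report_mistake("alice", "alice_new_haircut.png")
queue.report_mistake("alice", "alice_sunglasses_scarf.png")

# Stand-in for the reviewer: reject the heavily occluded capture.
n_accepted = queue.review(lambda user_id, image: "sunglasses" not in image)
```

Only the accepted capture ends up in `queue.training_set`, so the model would be retrained on that list rather than on raw feedback.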
But if you trained the model on a proper dataset to begin with, almost all of the feedback images you receive would not properly qualify as a face, because if they did, the model would not have failed in the first place.
Regarding a maximum: if I recall correctly, there is no hard limit you have to respect, but beyond a certain point the time needed to retrain the model becomes absurdly long, which could be unwanted in your specific situation.
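If retraining time is the concern, one possible way to keep it bounded is to cap how many images are stored per user. This is only a sketch of that idea, not something the eigenfaces method requires. The cap value, `MAX_IMAGES_PER_USER`, and `add_training_image` are assumptions made up for illustration.

```python
from collections import defaultdict, deque

MAX_IMAGES_PER_USER = 3  # assumed cap; tune it to the retraining time you can accept

# One bounded buffer per user; appending past maxlen drops the oldest capture.
training_images = defaultdict(lambda: deque(maxlen=MAX_IMAGES_PER_USER))


def add_training_image(user_id, image):
    training_images[user_id].append(image)


for i in range(5):
    add_training_image("alice", f"alice_{i}.png")

kept = list(training_images["alice"])
```

Note that this drops the oldest captures first, which may include the original, well-chosen training images. Keeping those in a separate, fixed set and capping only the feedback captures would be a reasonable variation.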
I hope this makes sense to you. If you have any more questions about my answer, just leave a comment.
Upvotes: 1