Reputation: 21
I have a situation where:
I think this is a classical case of overfitting.
As per my knowledge, I can use regularization. I have read cross validation will also helps in solving my over fitting problem.
Some inquiries I have regarding this:
Upvotes: 1
Views: 901
Reputation: 501
I think you are confused on what exactly cross validation is. I will link to OpenML's explanation for 10-fold cross validation so you get a better idea.
Over-fitting occurs normally when there is not enough data for your model to train on, resulting in it learning patterns/similarities between the data set that is not helpful, such as putting too much focus on outlying data that would be ignored if given a larger data set.
Now to your questions:
1-2. Cross-validation is just one solution that is helpful for preventing/solving over-fitting. Through partitioning the data set into k-sub groups, or folds, you then can train your model on k-1 folds. The last fold will be used as your unseen validation data to test your model upon. This will sometimes help prevent over-fitting. A factor in this working though depends on how long/how many epochs you are training your data for. Since you said you have a relatively small data set, you want to make sure you aren't 'over-learning' on this data. Implementing cross-validation will not do you much good if you are training for hundreds/thousands of epochs on a really small data set.
The biggest problem, and you said it yourself in the comments, is you don't have a lot of data. The best, although not always the easiest way, is to increase your data size so your model won't learn unimportant tendencies and put too much focus on the outliers.
I will link to a website that is incredibly helpful in explaining the problems of over-fitting and gives a variety of ways to attempt to overcome this problem.
Let me know if I was of help!
Upvotes: 1