Reputation: 105
I'm training a Keras deep learning model with 3-fold cross-validation. Each fold gives me a best-performing model, and at the end my algorithm reports the combined score of the three best models. My question is: is there a way to combine the three models in the end, or would it be legitimate to simply take the best-performing one of the three?
Upvotes: 1
Views: 618
Reputation: 14983
A (more) correct reflection of your performance on your dataset would be to average the N fold results on your validation sets.
As for the three resulting models, you can average their predictions (a voting ensemble) for a new data point. In other words, whenever a new data point arrives, predict with all three models and average the results.
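A minimal sketch of that averaging step, assuming each model exposes a Keras-style `predict` method and all models output arrays of the same shape (the function name `ensemble_predict` is mine, not from any library):

```python
import numpy as np

def ensemble_predict(models, x):
    """Simple voting ensemble: average the predictions of several trained models.

    `models` is any iterable of objects with a Keras-style .predict(x) method
    that all return arrays of the same shape, e.g. (n_samples, n_classes).
    """
    # Stack into shape (n_models, n_samples, n_classes), then average over models.
    preds = np.stack([m.predict(x) for m in models])
    return preds.mean(axis=0)

# For a classifier, take the argmax of the averaged probabilities:
# labels = ensemble_predict([model_1, model_2, model_3], x_new).argmax(axis=1)
```

For classification you can equivalently average the predicted probabilities (soft voting, as above) or take a majority vote on the predicted labels (hard voting); soft voting usually works better when the models output calibrated probabilities.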
Please note a very important thing: the purpose of K-fold cross-validation is model checking, not model building. K-fold cross-validation ensures that a single random split of your data, say 80-20 percent, does not happen to produce a very easy test set. A very easy test set would lead the developer to believe they have a very good model, while on genuinely unseen data the model would perform much worse.
In essence, what you eventually want to do is take all the data you were using for both training and testing and use it only for training the final model.
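The whole workflow can be sketched as follows. This is a sketch, not your exact setup: I use scikit-learn's `KFold` and a scikit-learn classifier as a lightweight stand-in for a Keras model, and the helper name `cross_validated_score` is my own:

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validated_score(build_model, X, y, n_splits=3):
    """Estimate generalization performance with K-fold CV (model *checking*).

    `build_model` is a zero-argument factory so that each fold trains a
    fresh model instead of continuing from the previous fold's weights.
    """
    scores = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, val_idx in kf.split(X):
        model = build_model()
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[val_idx], y[val_idx]))
    return float(np.mean(scores))

# Once the averaged fold score tells you the architecture is good enough,
# do the model *building* step: retrain one model on ALL the data.
# final_model = build_model()
# final_model.fit(X, y)
```

With a Keras model the structure is the same: the factory would call your model-construction function (so weights are re-initialized per fold), `fit` would take your usual epochs/batch-size arguments, and the final model would be trained once on the full dataset.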
Upvotes: 1