Reputation: 1
As an example, I can use cross-validation when I do hyperparameter tuning (GridSearchCV), select the best estimator from there, run RFECV, and then run cross_validate again. But this is a time-consuming task. I'm new to data science and still learning things. Can an expert help me learn how to use these techniques properly when building a machine learning model?
I have time series data. I'm trying to do hyperparameter tuning and cross-validation for a prediction model, but it is taking a long time to run. I need to learn the most efficient way to do these things during the model-building process.
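For reference, the workflow described above could be sketched roughly like this in scikit-learn. This is only an illustration on synthetic data; the estimator (Ridge), the parameter grid, and the data shapes are placeholders, not recommendations. `TimeSeriesSplit` is used instead of plain k-fold because the data is a time series.

```python
# Sketch of: GridSearchCV -> best estimator -> RFECV -> cross_validate.
# Synthetic data and Ridge are illustrative assumptions only.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit, cross_validate
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = X[:, 0] * 2.0 + X[:, 1] - X[:, 2] + rng.normal(scale=0.1, size=200)

# TimeSeriesSplit keeps temporal order: each fold trains on the past
# and validates on the future, which plain shuffled k-fold would violate.
tscv = TimeSeriesSplit(n_splits=5)

# Step 1: hyperparameter tuning with cross-validation.
search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=tscv)
search.fit(X, y)

# Step 2: recursive feature elimination with the tuned estimator.
selector = RFECV(search.best_estimator_, cv=tscv)
selector.fit(X, y)

# Step 3: a final cross-validated score on the reduced feature set.
scores = cross_validate(search.best_estimator_, selector.transform(X), y, cv=tscv)
print(selector.n_features_, scores["test_score"].mean())
```

Each of the three steps re-fits the model several times per fold, which is why the combined pipeline is slow.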
Upvotes: 0
Views: 188
Reputation: 86
Cross-validation is a tool for evaluating model performance and, specifically, for detecting over-fitting. If you put all of the data on the training side, your model can over-fit and fail to generalise to unseen data.
Hyperparameter tuning should not be based on cross-validation alone; hyperparameters should be changed based on model performance, for example the depth of the trees in a tree-based algorithm.
When you do a 10-fold CV, it is similar to training 10 models, so of course it has a time cost. You can tune the hyperparameters based on the CV result, since the CV score is itself an evaluation of the model. However, it does not make sense to tune and then run cross-validation again as a check, because the parameters were already optimised against that same cross-validation result.
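The point above can be seen directly in scikit-learn: `GridSearchCV` already stores a cross-validated score for every candidate, so a second `cross_validate` pass on the same data repeats work without adding new information. A minimal sketch on synthetic data (Ridge and the grid are assumptions for illustration):

```python
# GridSearchCV reports one cross-validated score per candidate:
# tuning and evaluation happen in the same pass.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=150)

search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]},
                      cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)

# One mean CV score per alpha, already computed during the search.
for alpha, score in zip(search.cv_results_["param_alpha"],
                        search.cv_results_["mean_test_score"]):
    print(alpha, round(score, 3))
print("best:", search.best_params_, round(search.best_score_, 3))
```

If you want an unbiased estimate after tuning, the usual fix is a held-out test set (or nested CV), not re-running CV on the data the search already saw.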
P.S. If you are new to data science, you could learn about regularization and dimensionality reduction, which lower the dimensionality of your data and so reduce the time cost.
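As one hedged sketch of that idea: putting a dimensionality-reduction step (here PCA, chosen only as an example) in front of the model shrinks the feature matrix, so every fit inside the cross-validated search is cheaper. The component count and grid below are placeholders, not tuned values:

```python
# PCA before the model: each CV fit sees 10 features instead of 50.
# n_components and the alpha grid are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 50))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=200)

pipe = Pipeline([("pca", PCA(n_components=10)), ("model", Ridge())])
search = GridSearchCV(pipe, {"model__alpha": [0.1, 1.0, 10.0]},
                      cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)
print(search.best_params_)
```

Using a `Pipeline` also keeps the reduction step inside each fold, so the PCA is re-fit on the training part only and does not leak information from the validation part.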
Upvotes: 1