Reputation: 25
I'm currently using cross_validation.cross_val_predict to obtain the predictions made by a LogisticRegression classifier. My question is: what percentage of the data makes up the training set and what percentage makes up the test set? Is it an 80%-20% split?
I checked the scikit-learn website and other questions on Stack Overflow but did not find an answer.
Upvotes: 2
Views: 580
Reputation: 76297
The documentation for this function says the following about the cv argument:
cv : cross-validation generator or int, optional, default: None A cross-validation generator to use. If int, determines the number of folds in StratifiedKFold if y is binary or multiclass and estimator is a classifier, or the number of folds in KFold otherwise. If None, it is equivalent to cv=3. This generator must include all elements in the test set exactly once. Otherwise, a ValueError is raised.
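So there is no single 80%-20% split. With the default of 3 folds quoted above, every sample is predicted exactly once by a model trained on the other two folds, which works out to roughly 67% training and 33% test data per fold. Passing cv=5 would give the 80%-20% ratio. As far as I know, newer scikit-learn releases default to 5 folds, so check the documentation for your installed version. Here is a minimal sketch of the arithmetic, assuming a plain k-fold split where any leftover samples go to the first folds. The helper name fold_split_sizes is made up for illustration and is not part of scikit-learn, and stratified folds may differ by a sample or two.

```python
def fold_split_sizes(n_samples, n_folds):
    """Return a (train_size, test_size) pair for each fold of a plain k-fold split.

    Hypothetical helper for illustration only; it is not a scikit-learn function.
    """
    # Every fold gets n_samples // n_folds test samples; the first
    # n_samples % n_folds folds get one extra so that all samples are used.
    base, extra = divmod(n_samples, n_folds)
    sizes = []
    for i in range(n_folds):
        test_size = base + (1 if i < extra else 0)
        sizes.append((n_samples - test_size, test_size))
    return sizes


n_samples = 100
for n_folds in (3, 5):
    print(f"cv={n_folds}")
    for train_size, test_size in fold_split_sizes(n_samples, n_folds):
        print(f"  train={train_size / n_samples:.0%}  test={test_size / n_samples:.0%}")
```

In every case each sample lands in a test fold exactly once, which is why cross_val_predict can return one prediction per input row.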
Upvotes: 1