Scikit learn cross validation split

Question

I'm currently using cross_validation.cross_val_predict to obtain the predictions made by a LogisticRegression classifier. My question is: what percentage of the data makes up the training set and what percentage makes up the test set? Is it an 80%-20% split?

I checked the scikit-learn website and other questions on Stack Overflow but did not find an answer.

Ami Tavory · Accepted Answer

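The documentation for this function describes the cv argument as quoted below. With the default of 3 folds, each prediction comes from a model trained on the other two folds, so the split is roughly 67%-33% rather than 80%-20%. To get an 80%-20% split, pass cv=5.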

cv : cross-validation generator or int, optional, default: None
    A cross-validation generator to use. If int, determines the number of folds in StratifiedKFold if y is binary or multiclass and estimator is a classifier, or the number of folds in KFold otherwise. If None, it is equivalent to cv=3. This generator must include all elements in the test set exactly once. Otherwise, a ValueError is raised.
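
To check the train/test proportions directly, here is a minimal sketch that prints the size of each fold. It makes some assumptions that are not in the question: the data is a synthetic set from make_classification, and the imports come from sklearn.model_selection, which replaced the cross_validation module in later scikit-learn releases. The number of folds is passed explicitly because the default changed from 3 to 5 in newer versions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic binary classification data (an assumption for illustration only).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Same splitter that an integer cv selects for a classifier with binary y.
cv = StratifiedKFold(n_splits=3)

# Record (train size, test size) for every fold.
fold_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in cv.split(X, y)]
for n_train, n_test in fold_sizes:
    print(f"train: {n_train / len(X):.0%}  test: {n_test / len(X):.0%}")

# Each sample is predicted exactly once, by the model that did not train on it.
y_pred = cross_val_predict(LogisticRegression(), X, y, cv=cv)
```

Changing n_splits to 5 should give folds of about 80% training data and 20% test data.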
