Reputation: 1616
I am getting the error
ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1
A similar question exists (ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1), but it does not resolve the issue for me.
With Python 3.8.6 and scikit-learn==0.23.2 everything worked fine.
After updating to Python 3.9.5 with scikit-learn==0.24.2 I get this error. I have 191 samples in X_test, and I am unsure why the library version change causes this issue.
I am using cv=3 on a dataset of 1000 records in total.
Full Code
X_train, X_test, y_train, y_test = train_test_split(features,
                                                    labels,
                                                    test_size=0.2,
                                                    random_state=200)

smote_enn = SMOTEENN(sampling_strategy='all', random_state=127)
X_train_senn, y_train_senn = smote_enn.fit_resample(X_train, y_train)

lgbmclassifier = LGBMClassifier(boosting_type='gbdt',
                                max_depth=-1,
                                device_type=deviceType,
                                verbose=0,
                                objective='binary',
                                class_weight='balanced',
                                force_row_wise=True,
                                subsample_for_bin=200000,
                                min_child_samples=20,
                                random_state=50)

lgbmgrid = CVGrid(lgbmclassifier, FE_Hyperparamerters)
lgbmgrid_result = lgbmgrid.fit(X_train,
                               y_train,
                               eval_metric='auc',
                               eval_set=[(X_test, y_test)],
                               early_stopping_rounds=ESR,
                               verbose=1)
Error
File "C:\prg\utils.py", line 851, in feHPTuning
lgbmgrid_result = lgbmgrid.fit(X_train,
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
return f(*args, **kwargs)
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\model_selection\_search.py", line 762, in fit
cv_orig = check_cv(self.cv, y, classifier=is_classifier(estimator))
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
return f(*args, **kwargs)
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\model_selection\_split.py", line 2062, in check_cv
return StratifiedKFold(cv)
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
return f(*args, **kwargs)
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\model_selection\_split.py", line 636, in __init__
super().__init__(n_splits=n_splits, shuffle=shuffle,
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
return f(*args, **kwargs)
File "C:\Users\prg\Anaconda3\envs\automl_py395elk7120_2\lib\site-packages\sklearn\model_selection\_split.py", line 280, in __init__
raise ValueError(
ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.
The fit call raises the error.
Upvotes: 0
Views: 3380
Reputation: 5304
This error is pretty straightforward: you cannot perform a KFold split with only one split.
The KFold documentation states that n_splits is the number of folds and must be at least 2.
If you only want a single split, use sklearn.model_selection.train_test_split instead.
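To illustrate the constraint, here is a minimal sketch. It uses GridSearchCV and LogisticRegression as stand-ins, since the CVGrid helper from the question is not shown; it reproduces the error with n_splits=1, shows a valid cv of at least 2, and uses train_test_split when only a single split is wanted:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

X, y = make_classification(n_samples=100, random_state=0)

# Reproduce the error: k-fold validation rejects n_splits < 2 at construction
try:
    StratifiedKFold(n_splits=1)
except ValueError as e:
    print(e)  # k-fold cross-validation requires at least one train/test split ...

# A valid setup: pass an int >= 2 or a splitter instance as cv
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={'C': [0.1, 1.0]},
                    cv=StratifiedKFold(n_splits=3))
grid.fit(X, y)
print(grid.best_params_)

# For a single train/test split, use train_test_split instead of KFold
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(X_train.shape[0], X_test.shape[0])  # 80 20
```

If your grid-search wrapper exposes a cv parameter, check what value it ultimately passes to scikit-learn: the traceback shows check_cv received 1, not the 3 you intended.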
Upvotes: 1