Reputation: 1839
I am trying to run the following binary classification model with XGBoost. X holds the features, and y holds the binary labels.
import xgboost as xgb
from sklearn.model_selection import train_test_split

# X holds the features, y the binary labels
X_trains, X_tests, y_trains, y_tests = train_test_split(X, y, test_size=0.05, random_state=56)
xgb_model = xgb.XGBClassifier(max_depth=5, learning_rate=0.08, objective='binary:logistic', n_jobs=-1).fit(X_trains, y_trains)
print('Accuracy of XGB classifier on training set: {:.2f}'
      .format(xgb_model.score(X_trains, y_trains)))
print('Accuracy of XGB classifier on test set: {:.2f}'
      .format(xgb_model.score(X_tests[X_trains.columns], y_tests)))
I am getting a memory error, but only for 756 KiB. My laptop has far more RAM than that, so why is this happening? If it helps, I am running this in Jupyter Notebook.
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
in
----> 1 X_trains, X_tests, y_trains, y_tests = train_test_split(X, y, test_size=0.05, random_state=56)
2 xgb_model = xgb.XGBClassifier(max_depth=5, learning_rate=0.08, objective= 'binary:logistic',n_jobs=-1).fit(X_trains, y_trains)
3 print('Accuracy of XGB classifier on training set: {:.2f}'
4 .format(xgb_model.score(X_trains, y_trains)))
5 print('Accuracy of XGB classifier on test set: {:.2f}'
C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in train_test_split(*arrays, **options)
2150 random_state=random_state)
2151
-> 2152 train, test = next(cv.split(X=arrays[0], y=stratify))
2153
2154 return list(chain.from_iterable((_safe_indexing(a, train),
C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in split(self, X, y, groups)
1339 """
1340 X, y, groups = indexable(X, y, groups)
-> 1341 for train, test in self._iter_indices(X, y, groups):
1342 yield train, test
1343
C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in _iter_indices(self, X, y, groups)
1452 for i in range(self.n_splits):
1453 # random partition
-> 1454 permutation = rng.permutation(n_samples)
1455 ind_test = permutation[:n_test]
1456 ind_train = permutation[n_test:(n_test + n_train)]
mtrand.pyx in numpy.random.mtrand.RandomState.permutation()
MemoryError: Unable to allocate 756. KiB for an array with shape (193452,) and data type int32
Upvotes: 0
Views: 295
Reputation: 307
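193452 * 4 bytes = 773808 bytes, which is about 773.8 kB (decimal) or 755.7 KiB (773808 / 1024). That matches the 756 KiB in the error message, so the figure is the size of the single int32 array that could not be allocated, not the total amount of memory available.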
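The arithmetic can be checked directly. This is a minimal sketch in plain Python, assuming 4 bytes per int32 element and taking the element count from the shape in the error message; the variable names are only illustrative.

```python
# Size of the array named in the MemoryError: shape (193452,), dtype int32.
n_samples = 193452      # taken from the error message
bytes_per_int32 = 4     # int32 is 32 bits = 4 bytes

total_bytes = n_samples * bytes_per_int32
size_kb = total_bytes / 1000    # decimal kilobytes
size_kib = total_bytes / 1024   # binary kibibytes, the unit NumPy reports

print(f"{total_bytes} bytes = {size_kb:.1f} kB = {size_kib:.1f} KiB")
```

The KiB value rounds to the 756 KiB that NumPy reports, which suggests the message describes one small allocation that failed rather than a limit of 756 KiB.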
Check your source code; there might be a dimension error.
Upvotes: 1