Agile_Eagle

Reputation: 1839

Why am I getting a MemoryError in Python for only 756 KiB?

I am trying to run the following binary classification model with XGBoost. X holds the features, and y holds the binary labels.

X_trains, X_tests, y_trains, y_tests = train_test_split(X, y, test_size=0.05, random_state=56)
xgb_model = xgb.XGBClassifier(max_depth=5, learning_rate=0.08, objective='binary:logistic', n_jobs=-1).fit(X_trains, y_trains)
print('Accuracy of XGB classifier on training set: {:.2f}'
       .format(xgb_model.score(X_trains, y_trains)))
print('Accuracy of XGB classifier on test set: {:.2f}'
       .format(xgb_model.score(X_tests[X_trains.columns], y_tests)))

I am getting a memory error, but only for 756 KiB. My laptop has far more RAM than that, so why is this happening? If it helps, I am running this in Jupyter Notebook.

   ---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
 in 
----> 1 X_trains, X_tests, y_trains, y_tests = train_test_split(X, y, test_size=0.05, random_state=56)
      2 xgb_model = xgb.XGBClassifier(max_depth=5, learning_rate=0.08, objective= 'binary:logistic',n_jobs=-1).fit(X_trains, y_trains)
      3 print('Accuracy of XGB classifier on training set: {:.2f}'
      4        .format(xgb_model.score(X_trains, y_trains)))
      5 print('Accuracy of XGB classifier on test set: {:.2f}'

C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in train_test_split(*arrays, **options)
   2150                      random_state=random_state)
   2151 
-> 2152         train, test = next(cv.split(X=arrays[0], y=stratify))
   2153 
   2154     return list(chain.from_iterable((_safe_indexing(a, train),

C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in split(self, X, y, groups)
   1339         """
   1340         X, y, groups = indexable(X, y, groups)
-> 1341         for train, test in self._iter_indices(X, y, groups):
   1342             yield train, test
   1343 

C:\Python38\lib\site-packages\sklearn\model_selection\_split.py in _iter_indices(self, X, y, groups)
   1452         for i in range(self.n_splits):
   1453             # random partition
-> 1454             permutation = rng.permutation(n_samples)
   1455             ind_test = permutation[:n_test]
   1456             ind_train = permutation[n_test:(n_test + n_train)]

mtrand.pyx in numpy.random.mtrand.RandomState.permutation()

MemoryError: Unable to allocate 756. KiB for an array with shape (193452,) and data type int32

Upvotes: 0

Views: 295

Answers (1)

Lucius Hu

Reputation: 307

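193452 * 4 bytes = 773808 bytes, which is about 773.8 KB (decimal) or about 755.7 KiB. So the 756 KiB in the message is the size of the array NumPy tried to allocate, not the total memory available; the allocation failed because the process could not get even that much more memory.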
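
The arithmetic can be checked with a few lines of plain Python. This is only a sketch of the size calculation for the array named in the error message (shape (193452,), dtype int32); the variable names are illustrative and it does not reproduce the error itself.

```python
# Size of the array named in the error message: shape (193452,), dtype int32.
n_samples = 193452
itemsize = 4  # bytes per int32 element

nbytes = n_samples * itemsize
kib = nbytes / 1024  # binary kibibytes, the unit used in the error message
kb = nbytes / 1000   # decimal kilobytes

print(f"{nbytes} bytes = {kib:.1f} KiB = {kb:.1f} kB")
# prints: 773808 bytes = 755.7 KiB = 773.8 kB
```

Rounded to a whole number, 755.7 KiB is the 756 KiB reported in the traceback.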

Check the source code; there might be a dimension error.

Upvotes: 1
