Reputation: 1383
I am getting a MemoryError when trying to use a boolean mask to select rows from a DataFrame with 4 million rows and 3 columns.
When I run df.memory_usage().sum()
it returns 173526080,
which is roughly 0.17 GB,
and I have 32 GB of RAM. So it doesn't seem like it should run out of memory, as no earlier code is consuming large amounts of RAM.
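For reference, here is the sanity check I am doing on that figure (just converting the reported byte count to gibibytes):
total_bytes = df.memory_usage().sum()  # 173526080 for this DataFrame
print(total_bytes / 1024 ** 3)         # ~0.16 GiB, far below 32 GB of RAM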
This same approach worked in previous versions of the code with the same 4 million rows.
The code I run is:
x = df[exit_point] > 0  # exit_point is a column label; x is a boolean mask over the rows
print(df[x].shape)
The error I get is:
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2133, in __getitem__
return self._getitem_array(key)
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2175, in _getitem_array
return self._take(indexer, axis=0, convert=False)
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 2143, in _take
self._consolidate_inplace()
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 3677, in _consolidate_inplace
self._protect_consolidate(f)
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 3666, in _protect_consolidate
result = f()
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 3675, in f
self._data = self._data.consolidate()
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 3826, in consolidate
bm._consolidate_inplace()
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 3831, in _consolidate_inplace
self.blocks = tuple(_consolidate(self.blocks))
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 4853, in _consolidate
_can_consolidate=_can_consolidate)
File "C:\Users\joaoa\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 4876, in _merge_blocks
new_values = new_values[argsort]
MemoryError
I am lost on how to start debugging this. Any clues and hints would be very much appreciated.
Upvotes: 0
Views: 333
Reputation: 44
Maybe this helps:
[1] Use the low_memory=False argument while importing the file. For example:
df = pd.read_csv('filepath', low_memory=False)
[2] Use the dtype argument while importing the file so that columns are read with smaller numeric types (see the sketch after this list).
[3] If you are using Jupyter Notebook, try Kernel > Restart & Clear Output to free memory held by earlier cells.
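As a minimal sketch of [1] and [2] combined: the file path and column names below are placeholders, so adjust them to match your data.
import pandas as pd

# Hypothetical column names; downcasting float64 -> float32 roughly halves the footprint.
df = pd.read_csv(
    'filepath',
    dtype={'col_a': 'float32', 'col_b': 'float32', 'col_c': 'float32'},
    low_memory=False,
)
print(df.memory_usage().sum())  # compare against the original byte count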
Hope this helps!
Upvotes: 1