Reputation: 698
I have interpolated data in 3 numpy arrays, each of length 107952899.
When I combine these three numpy arrays into a pandas DataFrame, I get a MemoryError.
I have to do some calculations, and pandas makes them easier, so I preferred to work with pandas. I believe the combined memory size of the three numpy arrays is over 3 GB.
8 GB RAM, Python 3.6.3.
I understand the reason for such an error, but is there any way to avoid the MemoryError, or some other best practice to follow?
Upvotes: 1
Views: 897
Reputation: 1939
When I combine these three numpy arrays into a pandas DataFrame, I get a MemoryError.
Let's say that you do:
import numpy as np
import pandas as pd

# Three arrays of 10**7 float64 values each
big_array_1 = np.random.random(10**7)
big_array_2 = np.random.random(10**7)
big_array_3 = np.random.random(10**7)
On my computer, it takes around 300 MB of memory.
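If you want to check that figure yourself, numpy can report the raw footprint of each array directly (a quick sanity check, using the arrays defined above):
for arr in (big_array_1, big_array_2, big_array_3):
    # 10**7 float64 values at 8 bytes each: ~76 MiB per array, ~229 MiB in total
    print(arr.nbytes / 1024**2, "MiB")
That is consistent with the roughly 300 MB reported above once interpreter overhead is included.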
Then if I do:
df = pd.DataFrame([big_array_1, big_array_2, big_array_3])
The memory usage soars to about 9 GB of RAM. If you multiply that by a factor of 10 (to get your 3 GB of data instead of my 300 MB), you end up at around 90 GB, which is probably more than your RAM plus available swap, and that would raise a MemoryError.
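Note that the list form also gives a layout you probably do not want: each array becomes a row rather than a column. You can see this from the shape of the frame built above:
print(df.shape)   # (3, 10000000): three rows of 10**7 columns each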
But if, instead, you do:
df = pd.DataFrame({"A":big_array_1, "B": big_array_2, "C":big_array_3})
then your memory usage will not be significantly larger than that of your three arrays alone.
I suspect that this is your issue...
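If you want to confirm that the dict-based frame stays close to the raw array size, you can compare the two directly (a quick check, reusing the arrays from above):
df = pd.DataFrame({"A": big_array_1, "B": big_array_2, "C": big_array_3})
# Total bytes held by the three source arrays
arrays_mib = sum(a.nbytes for a in (big_array_1, big_array_2, big_array_3)) / 1024**2
# Total bytes held by the DataFrame's columns (excluding the index)
frame_mib = df.memory_usage(index=False).sum() / 1024**2
print(round(arrays_mib), "MiB in the arrays,", round(frame_mib), "MiB in the DataFrame")
Both numbers come out around 230 MiB for the 10**7-element example, so the DataFrame itself is no heavier than the data it wraps.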
Upvotes: 2