Reputation: 95
I am manipulating time series data in a dataframe (df1) that has a bunch of input columns, 300 period columns, and 839826 rows.
If I try to only manipulate the 839826 x 300 section of this dataframe by multiplying it by a similarly shaped section of a different dataframe (df2):
df1.iloc[:, 0:301] = df1.iloc[:, 0:301] * df2.iloc[:, 0:301]
I get this error:
Unable to allocate 1.88 GiB for an array with shape (301, 839826) and data type float64
I found an answer to a similar question, but the solution was for Linux and I am working on Windows. I have read online that I should use Dask, but I am not sure how to implement it here, or whether it is even the right solution.
Upvotes: 0
Views: 9097
Reputation: 28684
The line
df1.iloc[:, 0:301] = df1.iloc[:, 0:301] * df2.iloc[:, 0:301]
first allocates a temporary array/DataFrame holding the result of the multiplication (this is the 1.88 GiB in the error message: 301 x 839826 float64 values at 8 bytes each) before assigning it into the output. You can try to avoid this by doing only in-place operations:
df1.iloc[:, 0:301] = df1.iloc[:, 0:301]
df1.iloc[:, 0:301] *= df2.iloc[:, 0:301]
This might get you over your immediate hurdle, but do investigate Dask if you run into this kind of situation often.
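If memory is still too tight, another option that stays within plain pandas is to work through the columns in blocks, so that any temporary is only as large as one block rather than the full 839826 x 301 section. Below is a minimal sketch of that idea, not a tested fix for your exact data. The helper name `multiply_in_chunks` and the block size of 50 are my own choices for illustration. It also assumes the two frames line up by position (same row order, same column order) and that the affected columns are already float64, because it multiplies the underlying NumPy arrays and skips pandas' label alignment.

```python
import numpy as np
import pandas as pd


def multiply_in_chunks(df1, df2, n_cols, chunk=50):
    """Multiply the first n_cols columns of df1 by those of df2, block by block.

    Hypothetical helper: assumes df1 and df2 are aligned by position,
    since working on raw arrays bypasses index/column label alignment.
    """
    for start in range(0, n_cols, chunk):
        stop = min(start + chunk, n_cols)
        # The temporary created here has shape (rows, stop - start),
        # not (rows, n_cols), so peak extra memory is bounded by the block size.
        block = df1.iloc[:, start:stop].to_numpy() * df2.iloc[:, start:stop].to_numpy()
        df1.iloc[:, start:stop] = block
    return df1


# Small stand-in frames; the real ones would be 839826 rows wide.
small1 = pd.DataFrame(np.ones((4, 6)), columns=list("abcdef"))
small2 = pd.DataFrame(np.full((4, 6), 3.0), columns=list("abcdef"))
multiply_in_chunks(small1, small2, n_cols=5, chunk=2)
print(small1)
```

For the frames in the question the call would be something like `multiply_in_chunks(df1, df2, n_cols=301)`. A smaller `chunk` lowers peak memory at the cost of more loop iterations.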
Upvotes: 1