Reputation: 17
df1 = pd.DataFrame(np.column_stack([CIK, period, data]), columns=['CIK','Period','Text'])
I have 3 lists which I want to be the columns of my dataframe. The code above worked fine when my data was small, but now it gives me a memory error. Am I missing something? Is there a different way to do this?
Upvotes: 0
Views: 66
Reputation: 402353
You could build a dataframe by passing a dict to it.
i = ['CIK', 'Period', 'Text']   # column names
j = [CIK, period, data]         # the three lists from the question
df = pd.DataFrame(dict(zip(i, j)))
This is cheap because building the dict does not copy your data: the dict simply holds key-value pairs that reference your existing lists, so only references are moved around. With your column_stack call, on the other hand, the lists must first be copied into a freshly allocated array before the dataframe is built, which is wasteful.
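To make the difference concrete, here is a minimal sketch with tiny made-up lists standing in for CIK, period and data (the values are hypothetical, not taken from the question). It builds the dataframe from a dict and, for comparison, shows what column_stack produces. NumPy's default string dtype is fixed-width, so every cell of the stacked array is as wide as the longest string in any column, which is a likely reason the memory use grows so quickly when one column holds long text.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the three lists in the question.
CIK = ["0000001750", "0000001800", "0000002034"]
period = ["2017Q1", "2017Q2", "2017Q3"]
data = ["short text", "a much longer filing text " * 10, "x"]

# Dict route: each list becomes its own column.
df = pd.DataFrame({"CIK": CIK, "Period": period, "Text": data})

# column_stack route: all values are copied into one new 2-D string array.
stacked = np.column_stack([CIK, period, data])

# Bytes per cell is set by the longest string anywhere in the array.
print(stacked.dtype, stacked.dtype.itemsize)
print(stacked.nbytes)
```

With real data, the longest entry in Text sets the width of every cell, including the short CIK and Period values.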
Upvotes: 2