Newskooler

Reputation: 4245

How to convert a list of pandas dataframes into a 3d numpy array?

If I have a list of pd.DataFrame like so:

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcde'))
df_list = [df, df]

Question: How can I convert it to a 3D np.array with shape (2, 4, 5)?

I tried np.array(df_list), but I get the following error:

ValueError: cannot copy sequence with size 4 to array axis with dimension 5

Upvotes: 5

Views: 3801

Answers (2)

William Wang

Reputation: 343

Use map() and df.to_numpy():

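import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
df_list = [df, df]

# Convert each DataFrame to a 2D array, then combine them into one 3D array
np_array = np.array(list(map(lambda x: x.to_numpy(), df_list)))

# Check the shape: (number of DataFrames, rows, columns)
print(np_array.shape)  # (2, 4, 5)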

The column order along the last axis (axis 2) matches the column order in the pandas DataFrame. If you need a particular column order, reorder the columns before calling df.to_numpy().

You can also reorder the columns after converting to NumPy, but reordering them in the DataFrame, where they still have names, is much easier to debug.
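As a rough sketch of both reordering options, assuming a made-up target order (column_order is a hypothetical name chosen only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
df_list = [df, df]

# Hypothetical target order, not from the question
column_order = ['e', 'd', 'c', 'b', 'a']

# Option 1: reorder by name in pandas, then convert each frame
np_array = np.array([d[column_order].to_numpy() for d in df_list])

# Option 2: convert first, then reorder the last axis by integer position
positions = [df.columns.get_loc(c) for c in column_order]
reordered_in_numpy = np.array([d.to_numpy() for d in df_list])[:, :, positions]
```

Both options should give the same array. Option 1 keeps the column names visible at the point where the order is chosen, which is why it tends to be easier to check.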

Upvotes: 2

Aliakbar Saleh

Reputation: 649

Convert each DataFrame to a NumPy array first, then wrap the results in an outer array to get a 3D array:

np.array([np.array(df), np.array(df)])
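The same idea can be applied to the whole df_list instead of writing each DataFrame out by hand. This sketch swaps in np.stack, which is not what the answer above uses; it is shown as one possible alternative because it raises an error if the frames do not all have the same shape:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
df_list = [df, df]

# Stack the 2D arrays along a new leading axis (axis=0 by default)
stacked = np.stack([d.to_numpy() for d in df_list])

print(stacked.shape)  # (2, 4, 5)
```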

Upvotes: 1
