Pandas: Take the median across multiple dataframes

Question

This question was asked before on this site, but the suggested solution does not work for me. I have multiple dataframes, all with the same columns and index, that look like the one below:

             E          F          H          I
row
CE   17.917153  10.875160   9.970251  12.255511
CF    9.780500  16.261098  10.021619   9.447307
CH   12.293967  10.608844  10.870527  17.720458
CI    9.967815  11.181572  17.550371  10.845565

Across all the dataframes, I want to take the element-wise median, i.e. the median of each (i, j) entry.

For instance, if I store all my dataframes in a dictionary called dict and run:

np.median(dict.values(), axis=0)

as suggested here, I get the following error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is the correct way to proceed?

elyase · Accepted Answer

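A "pandas-y" way to do this is to create a Panel. Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this only works on older versions. For example: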

>>> l = [df1, df2, df3, ...]
>>> panel = pd.Panel({i: df for i, df in enumerate(l)})
>>> panel

Dimensions: 2 (items) x 4 (major_axis) x 4 (minor_axis)
Items axis: 0 to 1
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 3

Now to calculate the median just do:

panel.median(axis=0)
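
On pandas versions without pd.Panel, the same element-wise median can be obtained in other ways. The following is only a sketch, assuming every dataframe has exactly the same index and columns as in the question. The names frames, labels, and cols and the random sample data are made up for illustration. It shows two options: stacking the underlying arrays with NumPy, and concatenating the frames and grouping by index label.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three frames with identical index and columns.
labels = ["CE", "CF", "CH", "CI"]
cols = ["E", "F", "H", "I"]
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame(rng.normal(10, 3, size=(4, 4)), index=labels, columns=cols)
    for _ in range(3)
]

# Option 1: stack into an array of shape (n_frames, n_rows, n_cols) and take
# the median along the first axis. This relies on all frames having their
# rows and columns in the same order.
stacked = np.stack([df.to_numpy() for df in frames])
median_np = pd.DataFrame(
    np.median(stacked, axis=0),
    index=frames[0].index,
    columns=frames[0].columns,
)

# Option 2: concatenate the frames vertically and group rows by index label.
# This aligns on labels rather than position; groupby sorts the index labels.
median_gb = pd.concat(frames).groupby(level=0).median()
```

If the dataframes are stored in a dictionary, list(d.values()) can be used in place of frames. Option 2 assumes the index labels are unique within each dataframe.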
