Luke Ning

Reputation: 1

Max of multiple columns in pandas across multiple dataframes

I'm using df.corr() to create a correlation dataframe for each of several dataframes I'm working on. How do I find the max/min/mean/std. dev. of the corresponding values across the separate correlation dataframes and create a dataframe out of that?

Upvotes: 0

Views: 1516

Answers (2)

piRSquared

Reputation: 294488

IIUC, this provides the min, max, std, etc. for each set of identically positioned cells. So for column ('X', 'X') you'll get the stats for the df.loc['X', 'X'] cell across all dataframes in lodf:

import numpy as np
import pandas as pd

np.random.seed([3,1415])

lodf = [
    pd.DataFrame(
        np.random.randint(10, size=(3, 3)),
        list('XYZ'), list('XYZ')
    ) for _ in range(100)
]

pd.concat(
    dict(enumerate([d.stack() for d in lodf]))
).unstack(level=[1, 2]).describe()

           X                    Y                    Z              
           X      Y      Z      X      Y      Z      X      Y      Z
count 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
mean    4.56   4.81   4.93   4.28   4.31   4.49   4.54   4.53   4.60
std     2.99   2.81   2.82   3.06   2.97   3.02   3.05   2.80   2.87
min     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
25%     2.00   2.75   2.00   1.00   2.00   2.00   1.75   2.00   3.00
50%     4.00   5.00   5.50   5.00   4.00   4.00   5.00   4.00   4.00
75%     7.00   7.00   7.00   7.00   7.00   7.00   7.00   7.00   7.25
max     9.00   9.00   9.00   9.00   9.00   9.00   9.00   9.00   9.00
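
For the df.corr() case in the question, the same cell-wise idea can be sketched with NumPy by stacking the correlation matrices along a new axis and reducing over it. This is only a sketch under assumptions: the input list dfs and its random contents are hypothetical, and every dataframe is assumed to have the same columns in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical input: three dataframes sharing the same columns
rng = np.random.default_rng(0)
dfs = [pd.DataFrame(rng.random((20, 3)), columns=list('XYZ')) for _ in range(3)]

# One correlation matrix per dataframe
corrs = [d.corr() for d in dfs]

# Stack into a 3-D array of shape (n_frames, n_cols, n_cols)
arr = np.stack([c.to_numpy() for c in corrs])
labels = corrs[0].index

def as_frame(values):
    # Wrap a 2-D result back into a labelled dataframe
    return pd.DataFrame(values, index=labels, columns=labels)

# Reduce over axis 0, i.e. across dataframes, cell by cell
cell_stats = pd.concat(
    {
        'max': as_frame(arr.max(axis=0)),
        'min': as_frame(arr.min(axis=0)),
        'mean': as_frame(arr.mean(axis=0)),
        'std': as_frame(arr.std(axis=0, ddof=1)),  # sample std, like pandas
    },
    axis=1,
)
```

Here cell_stats['max'] is a 3x3 dataframe holding, for each pair of columns, the largest correlation seen in any of the input dataframes; the other keys work the same way.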

Upvotes: 1

Scott Boston

Reputation: 153500

Use pd.concat and describe with a list comprehension:

Inputs:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))
df2 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))
df3 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))

pd.concat([i.describe() for i in [df1,df2,df3]], keys=['df1','df2','df3'], axis=1)

Output:

            df1                                               df2            \
              A         B         C         D         E         A         B   
count  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000   
mean   0.333877  0.428859  0.871313  0.627086  0.674608  0.427097  0.550675   
std    0.306857  0.378634  0.086694  0.286641  0.221984  0.382306  0.167861   
min    0.035033  0.143601  0.795605  0.432969  0.441879  0.040908  0.402787   
25%    0.176734  0.214075  0.824027  0.462473  0.569911  0.237947  0.459449   
50%    0.318435  0.284549  0.852450  0.491976  0.697942  0.434986  0.516111   
75%    0.483299  0.571488  0.909168  0.724144  0.790973  0.620191  0.624619   
max    0.648163  0.858428  0.965886  0.956312  0.884003  0.805397  0.733128   

                                          df3                                \
              C         D         E         A         B         C         D   
count  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000   
mean   0.506573  0.495343  0.542382  0.609385  0.577433  0.426975  0.201287   
std    0.346116  0.238650  0.150438  0.133651  0.369295  0.426809  0.233817   
min    0.121840  0.242962  0.446248  0.457182  0.222946  0.027543  0.031918   
25%    0.363533  0.384337  0.455698  0.560287  0.386180  0.202110  0.067901   
50%    0.605227  0.525712  0.465148  0.663393  0.549413  0.376677  0.103884   
75%    0.698939  0.621533  0.590449  0.685487  0.754676  0.626691  0.285972   
max    0.792651  0.717354  0.715750  0.707581  0.959939  0.876705  0.468060   


              E  
count  3.000000  
mean   0.664598  
std    0.037764  
min    0.625907  
25%    0.646217  
50%    0.666527  
75%    0.683943  
max    0.701360 
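
If only max/min/mean/std are wanted rather than the full describe output, one possible variation is to swap describe for DataFrame.agg with a list of function names. This is a sketch, not part of the original answer; the seeded random inputs are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs mirroring the ones above, seeded for repeatability
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((3, 5)), columns=list('ABCDE'))
df2 = pd.DataFrame(rng.random((3, 5)), columns=list('ABCDE'))
df3 = pd.DataFrame(rng.random((3, 5)), columns=list('ABCDE'))

# agg with a list returns one row per statistic, one column per original column
wanted = ['max', 'min', 'mean', 'std']
summary = pd.concat(
    [d.agg(wanted) for d in [df1, df2, df3]],
    keys=['df1', 'df2', 'df3'],
    axis=1,
)
```

A single value can then be looked up with, for example, summary.loc['max', ('df2', 'C')].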

Upvotes: 2
