Reputation: 1
I'm using df.corr() to create a correlation dataframe for each of several dataframes that I'm working on. How do I find the max/min/mean/std. dev. for all of the respective values in each of the separate dataframes and create a dataframe out of that?
Upvotes: 0
Views: 1516
Reputation: 294488
IIUC, this provides the min, max, std, etc. for the set of similarly positioned cells. So for column ('X', 'X') you'll get the stats for the df.loc['X', 'X'] cell across all dataframes in lodf.
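import numpy as np
import pandas as pd

np.random.seed([3,1415])
# Build 100 sample 3x3 dataframes with matching row and column labels
lodf = [
    pd.DataFrame(
        np.random.randint(10, size=(3, 3)),
        list('XYZ'), list('XYZ')
    ) for _ in range(100)
]
# Stack each frame into a Series, concat keyed by list position, then
# move the (row, column) levels into the columns and describe each cell
pd.concat(
    dict(enumerate([d.stack() for d in lodf]))
).unstack(level=[1, 2]).describe()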
            X                       Y                       Z
            X       Y       Z       X       Y       Z       X       Y       Z
count  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00
mean     4.56    4.81    4.93    4.28    4.31    4.49    4.54    4.53    4.60
std      2.99    2.81    2.82    3.06    2.97    3.02    3.05    2.80    2.87
min      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
25%      2.00    2.75    2.00    1.00    2.00    2.00    1.75    2.00    3.00
50%      4.00    5.00    5.50    5.00    4.00    4.00    5.00    4.00    4.00
75%      7.00    7.00    7.00    7.00    7.00    7.00    7.00    7.00    7.25
max      9.00    9.00    9.00    9.00    9.00    9.00    9.00    9.00    9.00
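If only min/max/mean/std are needed and the inputs are actual correlation matrices, the same cell-by-cell idea can be sketched with a NumPy stack. This is a rough sketch, not the approach above: the sample data, the names dfs, corrs, stacked and frame are made up for illustration, and it assumes every correlation matrix has identical labels in the same order.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical input: three dataframes that share the same columns
dfs = [pd.DataFrame(rng.random((20, 3)), columns=list('XYZ')) for _ in range(3)]

# One correlation matrix per dataframe, stacked into shape (n_frames, 3, 3)
corrs = [d.corr() for d in dfs]
stacked = np.stack([c.to_numpy() for c in corrs])

labels = corrs[0].index  # assumed identical for every matrix


def frame(values):
    """Wrap a 2-D array in a dataframe labelled like the correlation matrices."""
    return pd.DataFrame(values, index=labels, columns=labels)


# Reduce along axis 0 so each cell summarises the same position across frames
summary = pd.concat({
    'min': frame(stacked.min(axis=0)),
    'max': frame(stacked.max(axis=0)),
    'mean': frame(stacked.mean(axis=0)),
    'std': frame(stacked.std(axis=0, ddof=1)),  # ddof=1 matches pandas' default
})
```

summary.loc['mean'] then gives the cell-wise mean correlation matrix, and likewise for the other three keys.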
Upvotes: 1
Reputation: 153500
Use pd.concat and describe with a list comprehension:
Inputs:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))
df2 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))
df3 = pd.DataFrame(np.random.rand(3,5),columns=list('ABCDE'))
# Describe each dataframe, then join the results side by side under a per-dataframe key
pd.concat([i.describe() for i in [df1,df2,df3]], keys=['df1','df2','df3'], axis=1)
Output:
            df1                                               df2           \
              A         B         C         D         E         A         B
count  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000
mean   0.333877  0.428859  0.871313  0.627086  0.674608  0.427097  0.550675
std    0.306857  0.378634  0.086694  0.286641  0.221984  0.382306  0.167861
min    0.035033  0.143601  0.795605  0.432969  0.441879  0.040908  0.402787
25%    0.176734  0.214075  0.824027  0.462473  0.569911  0.237947  0.459449
50%    0.318435  0.284549  0.852450  0.491976  0.697942  0.434986  0.516111
75%    0.483299  0.571488  0.909168  0.724144  0.790973  0.620191  0.624619
max    0.648163  0.858428  0.965886  0.956312  0.884003  0.805397  0.733128

                                          df3                               \
              C         D         E         A         B         C         D
count  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000  3.000000
mean   0.506573  0.495343  0.542382  0.609385  0.577433  0.426975  0.201287
std    0.346116  0.238650  0.150438  0.133651  0.369295  0.426809  0.233817
min    0.121840  0.242962  0.446248  0.457182  0.222946  0.027543  0.031918
25%    0.363533  0.384337  0.455698  0.560287  0.386180  0.202110  0.067901
50%    0.605227  0.525712  0.465148  0.663393  0.549413  0.376677  0.103884
75%    0.698939  0.621533  0.590449  0.685487  0.754676  0.626691  0.285972
max    0.792651  0.717354  0.715750  0.707581  0.959939  0.876705  0.468060

              E
count  3.000000
mean   0.664598
std    0.037764
min    0.625907
25%    0.646217
50%    0.666527
75%    0.683943
max    0.701360
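If the quartiles and count aren't wanted, one possible variation is to swap describe for agg with just the statistics the question lists. A minimal sketch, assuming the dataframes are held in a dict; the names frames, wanted and summary are illustrative only:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Hypothetical inputs keyed by name, shaped like df1..df3 above
frames = {
    name: pd.DataFrame(rng.random((3, 5)), columns=list('ABCDE'))
    for name in ['df1', 'df2', 'df3']
}

# Only the statistics asked for in the question
wanted = ['min', 'max', 'mean', 'std']

# The dict keys become the top level of the column MultiIndex
summary = pd.concat(
    {name: df.agg(wanted) for name, df in frames.items()},
    axis=1,
)
```

summary['df2'] selects the block for one dataframe, and summary.loc['mean'] gives the column means of every dataframe as one Series.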
Upvotes: 2