Reputation: 321
I am trying to calculate the Jarque-Bera test (a normality test) on my data, which looks like this after a chain of operations:
Data:
ranking        Q1     Q2     Q3     Q4
Date
2009-12-29    nan    nan    nan    nan
2009-12-30   0.12  -0.21  -0.36  -0.39
2009-12-31   0.05   0.09   0.06  -0.02
2010-01-01    nan    nan    nan    nan
2010-01-04   1.45   1.90   1.81   1.77
...           ...    ...    ...    ...
2020-10-13  -0.67  -0.59  -0.63  -0.61
2020-10-14  -0.05  -0.12  -0.05  -0.13
2020-10-15  -1.91  -1.62  -1.78  -1.91
2020-10-16   1.21   1.13   1.09   1.37
2020-10-19  -0.03   0.01   0.06  -0.02
I use a chain like this:
(data
    .sort_values('Date')
    .groupby([pd.Grouper(key='Date', freq='B'), 'ranking'])
    ['perf_corr']
    .apply(lambda x: x.mean()*100)
    .unstack()
    .agg([lambda x: x.mean(),
          lambda x: np.sqrt(x.var()),
          lambda x: x.skew(),
          lambda x: x.kurtosis(),
          ])
)
The output is:
             Q1     Q2     Q3     Q4
<lambda>   8.89   9.20   7.63   7.30
<lambda>  15.77  16.19  16.93  17.59
<lambda>  -1.04  -0.95  -0.79  -0.61
...
My question is simple: how can I replace the <lambda> row labels with 'mean', 'std', 'skew', ... (I also calculate other functions) inside my chain? Before the new pandas version, the lambda functions were labeled <lambda_1>, ..., so I used:
.rename(index={'lambda_1': "mean"})
but this no longer works, since every row is now labeled <lambda>.
Any idea how to proceed?
Upvotes: 0
Views: 470
Reputation: 321
I also found this solution:
data.set_axis(['mean','std'],axis='index')
and it works
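To show how set_axis fits inside the chain itself, here is a minimal sketch. The frame `df` and the list `labels` are made-up stand-ins (random numbers, not the data from the question); the only requirement is that `labels` has one entry per function passed to .agg(), in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the unstacked Q1..Q4 frame from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 4)), columns=["Q1", "Q2", "Q3", "Q4"])

# One label per aggregation function, in the same order as the list in .agg().
labels = ["mean", "std", "skew", "kurtosis"]

stats = (
    df
    .agg([lambda x: x.mean(),
          lambda x: np.sqrt(x.var()),
          lambda x: x.skew(),
          lambda x: x.kurtosis()])
    # Replace the row labels positionally, whatever pandas named the lambdas.
    .set_axis(labels, axis="index")
)
print(stats)
```

Because set_axis assigns the labels by position, it should not matter whether pandas named the rows <lambda> or <lambda_1>; the length of the list just has to match the number of rows.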
Upvotes: 1
Reputation: 4648
List the desired row names before the pipeline:
idx_names = ["mean", "std", "skew", "kurtosis"] # can be more
and chain this right after .agg():
.reset_index(drop=True).rename(dict(zip(range(len(idx_names)), idx_names)))
This resets the index to [0, 1, 2, ...] and builds a dictionary that maps those positions to the desired row names. In the above example, the mapping dictionary is {0: "mean", 1: "std", 2: "skew", 3: "kurtosis"}.
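For completeness, a small self-contained sketch of the reset_index + rename approach. The frame `df` is invented random data standing in for the unstacked result, and dict(enumerate(idx_names)) is used here as an equivalent shorthand for dict(zip(range(len(idx_names)), idx_names)).

```python
import numpy as np
import pandas as pd

idx_names = ["mean", "std", "skew", "kurtosis"]  # can be more

# Hypothetical stand-in for the unstacked Q1..Q4 frame from the question.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(60, 4)), columns=["Q1", "Q2", "Q3", "Q4"])

# Position -> label mapping: {0: "mean", 1: "std", 2: "skew", 3: "kurtosis"}
mapping = dict(enumerate(idx_names))

result = (
    df
    .agg([lambda x: x.mean(),
          lambda x: np.sqrt(x.var()),
          lambda x: x.skew(),
          lambda x: x.kurtosis()])
    .reset_index(drop=True)   # row labels become 0, 1, 2, 3
    .rename(index=mapping)    # map each position to its name
)
print(result)
```

The order of idx_names has to follow the order of the functions in .agg(), since the mapping is purely positional.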
Upvotes: 0