Reputation: 51
I've a data frame (portfolios) like this
Date ReturnonEquity
31-Jan-10 0.67
31-Jan-10 1.00
31-Jan-10 0.49
31-Jan-10 0.15
28-Feb-10 0.80
28-Feb-10 0.08
28-Feb-10 0.54
28-Feb-10 0.77
I calculated the winsorized Roe for every date -
portfolios.groupby(by='Date')['ROE'].apply(lambda x: zscore(mstats.winsorize(x[~np.isnan(x)],0.03)))
which outputs a series like this (ignore values)
2010-01-31 [0.67, 1.00, 0.49, 0.15]
2010-02-28 [0.80, 0.08, 0.54, 0.77]
how do i put the z-scores back in the portfolios as a new column (join creates the columns with the array, not like a continuous values)
Date ReturnonEquity RoE_winsorized
31-Jan-10 0.67 0.67
31-Jan-10 1.00 1.00
31-Jan-10 0.49 0.49
31-Jan-10 0.15 0.15
28-Feb-10 0.80 0.80
28-Feb-10 0.08 0.08
28-Feb-10 0.54 0.54
28-Feb-10 0.77 0.77
Thanks in advance!
Upvotes: 0
Views: 363
Reputation: 879531
Use transform (instead of apply), as chrisb suggests, and assign the result to portfolios[columname]
:
In [42]: portfolios['windsorized'] = portfolios.groupby(by='Date')['ROE'].transform(lambda x: (stats.zscore(mstats.winsorize(x[~np.isnan(x)],0.03))))
In [43]: portfolios
Out[43]:
Date ROE windsorized
0 2010-01-31 0.67 0.301112
1 2010-01-31 1.00 1.375348
2 2010-01-31 0.49 -0.284835
3 2010-01-31 0.15 -1.391624
4 2010-02-28 0.80 0.876605
5 2010-02-28 0.08 -1.623022
6 2010-02-28 0.54 -0.026038
7 2010-02-28 0.77 0.772454
Upvotes: 2