Reputation: 609
I have a multi-index dataframe of panel data that shows historical prices over time. What I basically want to do is group them by their average forecast over time- high (above 0.4), medium (between 0.2 and 0.4), and low (below 0.2).
price yield forecast
date assetid
1/1/2017 4200 96.44 0.23 0.64
1/1/2017 408 46.3 0.6 0.4
1/1/2017 413 50.68 0.47 0.73
1/1/2017 3911 82.48 0.33 0.84
1/1/2017 7392 97.24 0.4 0.62
1/1/2017 7144 31.86 0.18 0.54
1/1/2017 8793 59.66 0.65 0.9
1/2/2017 4200 57.1 0.21 0.69
1/2/2017 408 4.76 0.86 0.16
1/2/2017 413 70.79 0.24 0.12
1/2/2017 3911 5.43 0.91 0.44
1/2/2017 7392 47.33 0.51 0.18
1/2/2017 7144 17.85 0.79 0.59
1/2/2017 8793 98.08 0.2 0.24
So the first step I need to figure out is how to create a table that returns the mean value of each asset.
assetid Average of forecast
408 0.73
413 0.355
3911 0.62
4200 0.22
7144 0.485
7392 0.455
8793 0.425
Then, I have to just join this newly created table and match on the assetid, with the same classification over time.
price yield forecast type
date assetid
1/1/2017 4200 96.44 0.23 0.64 med
1/1/2017 408 46.3 0.6 0.4 high
1/1/2017 413 50.68 0.47 0.73 med
1/1/2017 3911 82.48 0.33 0.84 med
1/1/2017 7392 97.24 0.4 0.62 high
1/1/2017 7144 31.86 0.18 0.54 low
1/1/2017 8793 59.66 0.65 0.9 high
1/2/2017 4200 57.1 0.21 0.69 med
1/2/2017 408 4.76 0.86 0.16 high
1/2/2017 413 70.79 0.24 0.12 med
1/2/2017 3911 5.43 0.91 0.44 med
1/2/2017 7392 47.33 0.51 0.18 high
1/2/2017 7144 17.85 0.79 0.59 low
1/2/2017 8793 98.08 0.2 0.24 high
Upvotes: 3
Views: 56
Reputation: 323306
pd.cut
+ transform
df['type']=pd.cut(df.groupby(level='assetid')
.forecast
.transform('mean'),[0,0.2,0.4,np.inf],labels=['low','med','high'])
df
Out[663]:
price yield forecast type
date assetid
1/1/2017 4200 96.44 0.23 0.64 high
408 46.30 0.60 0.40 med
413 50.68 0.47 0.73 high
3911 82.48 0.33 0.84 high
7392 97.24 0.40 0.62 med
7144 31.86 0.18 0.54 high
8793 59.66 0.65 0.90 high
1/2/2017 4200 57.10 0.21 0.69 high
408 4.76 0.86 0.16 med
413 70.79 0.24 0.12 high
3911 5.43 0.91 0.44 high
7392 47.33 0.51 0.18 med
7144 17.85 0.79 0.59 high
8793 98.08 0.20 0.24 high
Upvotes: 3
Reputation: 153460
Let's use pd.cut
:
s = pd.cut(df.groupby('assetid')['forecast'].mean(),[0,.2,.4,np.inf], labels=['low','med','high'])
df['type'] = df.assetid.map(s)
print(df)
Output:
date assetid price yield forecast type
0 1/1/2017 4200 96.44 0.23 0.64 high
1 1/1/2017 408 46.30 0.60 0.40 med
2 1/1/2017 413 50.68 0.47 0.73 high
3 1/1/2017 3911 82.48 0.33 0.84 high
4 1/1/2017 7392 97.24 0.40 0.62 med
5 1/1/2017 7144 31.86 0.18 0.54 high
6 1/1/2017 8793 59.66 0.65 0.90 high
7 1/2/2017 4200 57.10 0.21 0.69 high
8 1/2/2017 408 4.76 0.86 0.16 med
9 1/2/2017 413 70.79 0.24 0.12 high
10 1/2/2017 3911 5.43 0.91 0.44 high
11 1/2/2017 7392 47.33 0.51 0.18 med
12 1/2/2017 7144 17.85 0.79 0.59 high
13 1/2/2017 8793 98.08 0.20 0.24 high
Upvotes: 2