Reputation: 1189
I have a large dataframe and a small dataframe that I would like to join together. The small dataframe holds the maximum weekly value corresponding to the range of values in the larger dataframe by group. I want to join the maximum value per week per group to the large frame.
large frame:
L
datetime group value
2014-05-07 A 3
2014-05-07 B 3
2014-05-14 A 4
2014-05-14 B 2
2014-05-15 A 6
2014-05-15 B 4
2014-05-16 A 7
2014-05-16 B 10
small frame:
S
weeknumber group maxval
1 A 3
1 B 3
2 A 7
2 B 10
Note these are hypothetical week values, the actual values will correspond to the week number in the year, out of 52.
To calculate the small frame, we can set an index on our datetime:
import pandas as pd
L.index = pd.to_datetime(L.datetime)
We can group by groups and week like so:
week = L.groupby([(lambda x : x.week),'group'], as_index=False)
Calculating the weekly max is simple:
weekmax = week.value.max()
This returns the small frame and is where I get stuck. I have tried merging the dataframes and performing joins but can't get it to work.
EDIT: This is what I want the final frame to look like after the join:
datetime group value maxval
2014-05-07 A 3 3
2014-05-07 B 3 3
2014-05-14 A 4 7
2014-05-14 B 2 10
2014-05-15 A 6 7
2014-05-15 B 4 10
2014-05-16 A 7 7
2014-05-16 B 10 10
Upvotes: 0
Views: 491
Reputation: 52266
You want to use transform
which will return a like-indexed version of the aggregation. So you can assign directly back to the larger frame, like this:
In [189]: L['maxval'] = week['value'].transform('max')
In [190]: L
Out[190]:
datetime group value maxval
datetime
2014-05-07 2014-05-07 A 3 3
2014-05-07 2014-05-07 B 3 3
2014-05-14 2014-05-14 A 4 7
2014-05-14 2014-05-14 B 2 10
2014-05-15 2014-05-15 A 6 7
2014-05-15 2014-05-15 B 4 10
2014-05-16 2014-05-16 A 7 7
2014-05-16 2014-05-16 B 10 10
Upvotes: 2