Reputation: 1330
I have a dataframe with the below biweekly data
date value
15-06-2012 20
30-06-2012 30
And I need to join with another dataframe that has below data:
date cost
2-05-2011 5
3-04-2012 80
2-06-2012 10
3-06-2012 10
4-06-2012 30
5-06-2012 20
10-06-2012 10
15-06-2012 10
18-06-2012 30
20-06-2012 20
21-06-2012 30
22-06-2012 30
29-06-2012 20
29-10-2012 30
I need to join 2 dataframes in such a way that from another dataframe, i get average cost between 1-15 th june 2012 to fill 15-06-2012 cost and similarly for 30-06-2012 cost, I get avg value between 16-06-2012 to 30-06-2012 and get the below results
date value cost
15-06-2012 20 15 which is (10+10+30+20+10+10)/6
30-06-2012 30 26 which is (30+20+30+30+20)/5
Upvotes: 3
Views: 157
Reputation: 323226
Change to datetime of your columns date , then we using merge_asof
#df.date=pd.to_datetime(df.date,dayfirst=True)
#df1.date=pd.to_datetime(df1.date,dayfirst=True)
df['keepkey']=df.date
mergedf=pd.merge_asof(df1,df,on='date',direction ='forward')
mergedf.groupby('keepkey',as_index=False).mean()
Out[373]:
keepkey cost value
0 2012-06-15 15 20
1 2012-06-30 26 30
Update :
df['keepkey']=df.date
df['key']=df.date.dt.strftime('%Y-%m')
df1['key']=df1.date.dt.strftime('%Y-%m')
mergedf=pd.merge_asof(df1,df,on='date',by='key',direction ='forward')
mergedf.groupby('keepkey',as_index=False).mean()
Out[417]:
keepkey cost key value
0 2012-06-15 15 6 20.0
1 2012-06-30 26 6 30.0
Upvotes: 6
Reputation: 402363
This would need a merge
followed by a groupby
:
m = df.merge(df2, on='date', how='outer')
m['date'] = pd.to_datetime(m.date, dayfirst=True)
m = m.sort_values('date')
(m.groupby(m['value'].notnull().shift().fillna(False).cumsum(),
as_index=False)
.agg({'date' : 'last', 'cost' : 'mean', 'value' : 'last'}))
date cost value
0 2012-06-15 15.0 20.0
1 2012-06-30 26.0 30.0
Upvotes: 3