moe_95
moe_95

Reputation: 417

Merge pandas DataFrames based on time interval

I am attempting to merge two DataFrames df_1 containing a weekly count and df_2 containing the average count per month.

Weekly count (df_1) seen below.

date	              Direct_Connection	    Indirect_Connection
2014-12-07        	36	                  124
2014-12-14        	38	                  111
2014-12-21        	28	                  95
2014-12-28        	15	                  58
2015-01-04        	5	                  9
2015-01-11        	22	                  85
2015-01-18        	49	                  174
2015-01-25        	47	                  171
2015-02-01        	54	                  187
2015-02-08 	        45                        216
2015-02-15 	        29                     	  125
2015-02-22        	22	                  63
2015-03-01        	27	                  96

Monthly Average (df_2) seen below.

date	      Direct_Connection   Indirect_Connection
2014-12-31	29.25	              97.0
2015-01-31	30.75	              109.75
2015-02-28	37.5	              147.75
2015-03-31	31.0	              119.0

The merge I would like to complete would add an 'Associated Monthly Average' columns to df_1 (Weekly Count) for both 'Direct_connection' and 'Indirect Connection'

An example of my desired output can be seen below.

date	              Direct_Connection	Assoc_dir_avg  Indirect_Connection Ass_indir_avg
2014-12-07        	36	              29.5           124                 97
2014-12-14        	38	              29.5           111                 97
2014-12-21        	28	              29.5           95                  97
2014-12-28        	15	              29.5           58                  97
2015-01-04        	5	              30.5           9                   109.75 
2015-01-11        	22	              30.5           85                  109.75
2015-01-18        	49	              30.5           174                 109.75
2015-01-25        	47	              30.5           171                 109.75
2015-02-01        	54	              37.5           187                 147.75
2015-02-08 	        45                    37.5           216             147.75
2015-02-15 	        29                    37.5     	     125             147.75
2015-02-22        	22	              37.5            63                 147.75
2015-03-01        	27	              31              96                 119

The above DataFrames are only a brief snippet of the entire DF's, which span over a number of years.

Any assistance anyone could provide would be greatly appreciated!

Upvotes: 0

Views: 63

Answers (1)

Vivek Kalyanarangan
Vivek Kalyanarangan

Reputation: 9081

Use -

# Cast to date if not done already 
df1['date'] = pd.to_datetime(df1['date'].str.strip(), format='%Y-%m-%d')
df2['date'] = pd.to_datetime(df2['date'].str.strip(), format='%Y-%m-%d')

from pandas.tseries.offsets import MonthEnd

df=df1.merge(df2, left_on=(df1['date'] + MonthEnd(1)), right_on='date').rename(columns={'Direct_Connection_x':'Direct_Connection',
                                                                        'Indirect_Connection_x':'Indirect_Connection',
                                                                        'Direct_Connection_y':'Assoc_dir_avg',
                                                                        'Indirect_Connection_y':'Ass_indir_avg',
                                                                        'date_x':'date'}).drop('date_y', axis=1)

Output

         date                      Direct_Connection  Indirect_Connection  \
0  2014-12-07                                     36                124.0   
1  2014-12-14                                     38                111.0   
2  2014-12-21                                     28                 95.0   
3  2014-12-28                                     15                 58.0   
4  2015-01-04                                      5                  9.0   
5  2015-01-11                                     22                 85.0   
6  2015-01-18                                     49                174.0   
7  2015-01-25                                     47                171.0   
8  2015-02-01                                     54                187.0   
9  2015-02-08          45                        216                  NaN   
10 2015-02-15                29                                     125.0   
11 2015-02-22                                     22                 63.0   
12 2015-03-01                                     27                 96.0   

    Assoc_dir_avg  Ass_indir_avg  
0           29.25          97.00  
1           29.25          97.00  
2           29.25          97.00  
3           29.25          97.00  
4           30.75         109.75  
5           30.75         109.75  
6           30.75         109.75  
7           30.75         109.75  
8           37.50         147.75  
9           37.50         147.75  
10          37.50         147.75  
11          37.50         147.75  
12          31.00         119.00  

Upvotes: 1

Related Questions