Reputation: 532
I would like to merge two dataframes of different lengths on two columns that only partially share their values.
The index of the left dataframe (A) is of datetime type, and the same date can appear several times with different times (hence, index.date does not help).
The index of the right dataframe (B) is of datetime.date type and each date is distinct, as expected.
import pandas as pd

A = pd.DataFrame({'datetime': ['2019-06-01 18:11:55', '2019-06-01 21:43:02', '2019-07-23 09:07:18', '2019-07-24 10:32:24'],
                  'value 1': [2, 5, 80, 0]})
B = pd.DataFrame({'date': ['2019-06-01', '2019-07-23', '2019-07-24'],
                  'value 2': [10, 7, 3]})
I need to merge the two dataframes on the date: each value of B should go to the row of A where that date first appears, and the remaining rows with the same date but different times should be filled with 0. The output should look something like this (with comments):
datetime             value 1  value 2
2019-06-01 18:11:55  2        10       # this is the first 2019-06-01 --> so it got the value of dataframe B
2019-06-01 21:43:02  5        0        # this is the second 2019-06-01 --> so the value 2 column got filled in with a 0 value
2019-07-23 09:07:18  80       7
2019-07-24 10:32:24  0        3
Your input is more than welcome ^_^
Upvotes: 2
Views: 1004
Reputation: 862441
Use:
# convert B's date column to datetime.date objects
B['date'] = pd.to_datetime(B['date']).dt.date
# convert A's datetime column to datetimes
A['datetime'] = pd.to_datetime(A['datetime'])
Create a new date column in A from its datetimes with Series.dt.date, so it can be matched against B['date'], and add a helper column g to both dataframes with GroupBy.cumcount, which numbers the repeated dates so that only the first occurrence matches:
A['date'] = A['datetime'].dt.date
A['g'] = A.groupby('date').cumcount()
B['g'] = B.groupby('date').cumcount()
#print (A)
#print (B)
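To see why the helper column works, here is a minimal sketch of what GroupBy.cumcount produces. The Series named dates is only a hypothetical stand-in for A['date'], written with plain strings to keep the example short:

```python
import pandas as pd

# Hypothetical stand-in for A['date']: the first date occurs twice.
dates = pd.Series(['2019-06-01', '2019-06-01', '2019-07-23', '2019-07-24'], name='date')

# cumcount numbers the rows inside each group, starting at 0,
# so the first occurrence of every date gets 0 and repeats get 1, 2, ...
g = dates.groupby(dates).cumcount()
print(g.tolist())  # [0, 1, 0, 0]
```

Because every date in B is distinct, B['g'] is 0 everywhere, so only the rows of A with g equal to 0 can find a partner in the merge.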
Then use DataFrame.merge on both columns with a left join, remove the helper columns, and replace the missing values in the added column with 0 by Series.fillna:
df = A.merge(B, on=['date', 'g'], how='left').drop(['date', 'g'], axis=1)
# the downcast argument of fillna is deprecated in recent pandas versions, so cast explicitly
df['value 2'] = df['value 2'].fillna(0).astype(int)
print(df)
             datetime  value 1  value 2
0 2019-06-01 18:11:55        2       10
1 2019-06-01 21:43:02        5        0
2 2019-07-23 09:07:18       80        7
3 2019-07-24 10:32:24        0        3
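As a cross-check, the same result can be reached without a merge. The sketch below is an alternative technique, not the approach used above: it looks each date up in B with Series.map and keeps the value only on the first row of each date with Series.duplicated. It assumes, as the question states, that the dates in B are distinct; the names dates, looked_up and out are illustrative only.

```python
import pandas as pd

A = pd.DataFrame({'datetime': pd.to_datetime(['2019-06-01 18:11:55', '2019-06-01 21:43:02',
                                              '2019-07-23 09:07:18', '2019-07-24 10:32:24']),
                  'value 1': [2, 5, 80, 0]})
B = pd.DataFrame({'date': pd.to_datetime(['2019-06-01', '2019-07-23', '2019-07-24']).date,
                  'value 2': [10, 7, 3]})

# Calendar date of every row in A (datetime.date objects, like B['date']).
dates = A['datetime'].dt.date

# Look up each row's date in B; dates missing from B would become NaN.
looked_up = dates.map(B.set_index('date')['value 2'])

# Keep the looked-up value on the first row of each date, 0 on the repeats.
out = A.copy()
out['value 2'] = looked_up.where(~dates.duplicated(), 0).fillna(0).astype(int)
print(out)
```

This avoids the helper columns, but it only handles the case where B has at most one row per date; the merge on date and g also works if B itself contains repeated dates.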
Upvotes: 1