Reputation: 2181
df1:
tID sID dID date1 date2
1234 4321 5432 7/12 8/13
7890 5688 4567 8/21 9/30
df2:
sID dID date3
4321 5432 7/20
5688 4567 9/15
I'd like to assign append a tID
to df2
if:
date3
is between date1
and date2
sID
are the samedID
are the samedf3:
tID sID dID date3
1234 4321 5432 7/20
7890 5688 4567 9/15
My first thought was to loop through both dataframes and use if
statements but that really doesn't seem like an efficient approach.
Any help would be much appreciated.
Upvotes: 2
Views: 70
Reputation: 17359
df3 = pd.merge(df1, df2, on=['sID', 'dID'])
print df3[df3.date3.between(df3.date1, df3.date2)].drop(['date1', 'date2'], axis=1)
tID sID dID date3
0 1234 4321 5432 7/20
1 7890 5688 4567 9/15
Thanks @RafaelC for the between
idea.
Upvotes: 2
Reputation: 59284
IIUC, first set them to the same index (as str
, if the IDs might contain ids such as 8/21
before the edit. If all ids are ints, no need to do the astype
steps)
df['sID'] = df.sID.astype(str)
df['dID'] = df.dID.astype(str)
df = df.set_index(['sID', 'dID'])
df2['sID'] = df2.sID.astype(str)
df2['dID'] = df2.dID.astype(str)
df2 = df2.set_index(['sID', 'dID'])
Then filter and assign using loc
m = df2.index.isin(df.index)
sub = df.loc[df2[m].date3.index]
s = df2[m].date3.between(sub.date1, sub.date2)
df2.loc[:, 'tID'] = df.loc[s[s].index, 'tID']
Outputs
date3 tID
sID dID
4321 5432 7/20 1234
5688 4567 9/15 7890
Upvotes: 2