Reputation: 483
I have a dataframe df1, which stores the current registered course and the previous course taken.
course | previous_course |
---|---|
CS201 | CS101 |
CS201 | CS102 |
CS201 | CS103 |
CS202 | CS101 |
CS202 | CS102 |
CS202 | CS103 |
Some courses have prerequisite, which are also stored in dataframe df2 (it's one to many relationship, one course could have many prerequisite)
course | prerequiste |
---|---|
CS201 | CS101 |
CS202 | CS102 |
CS202 | CS103 |
I would like to match if the previous_course in df1 is prerequisite or not, and create another column (1 or 0). Could you advise what should I do? Thanks a lot!
Upvotes: 2
Views: 74
Reputation: 23227
You can do a left .merge()
on df1
and df2
and then create the new column based on the joined prerequisite
column using np.where()
, as follows:
df3 = pd.merge(df1, df2, left_on=['course', 'previous_course'], right_on=['course', 'prerequiste'], how='left')
df3['is_prerequiste'] = np.where(df3['prerequiste'].notna(), 1, 0)
print(df3)
course previous_course prerequiste is_prerequiste
0 CS201 CS101 CS101 1
1 CS201 CS102 NaN 0
2 CS201 CS103 NaN 0
3 CS202 CS101 NaN 0
4 CS202 CS102 CS102 1
5 CS202 CS103 CS103 1
Upvotes: 3
Reputation: 863531
Use indicator parameter in DataFrame.merge
, compare by both
and cast to integers for map True, False
to 1,0
:
df = pd.merge(df1, df2, left_on=['course', 'previous_course'],
right_on=['course', 'prerequiste'],
how='left',
indicator='is_prerequiste')
df['is_prerequiste'] = df['is_prerequiste'].eq('both').astype(int)
print (df)
course previous_course prerequiste is_prerequiste
0 CS201 CS101 CS101 1
1 CS201 CS102 NaN 0
2 CS201 CS103 NaN 0
3 CS202 CS101 NaN 0
4 CS202 CS102 CS102 1
5 CS202 CS103 CS103 1
Upvotes: 3