Reputation: 179
I am looking to populate a new dataframe column with True if two cell values match another smaller subset dataframe in pandas, otherwise with a value of False.
For instance, this is original output dataframe I am constructing.
ID Type
1 A
2 B
3 A
4 A
5 C
6 A
7 D
8 A
9 B
10 A
And the smaller subset of the dataframe selected based on some criteria:
ID Type
1 A
3 A
4 A
5 C
7 D
10 A
What I am trying to accomplish is when ID and Type in the output dataframe match with the smaller subset datadrame, I want to populate a new column called 'Result' and value equals to True. Otherwise, value equals to False.
ID Type Result
1 A True
2 B False
3 A True
4 A True
5 C True
6 A False
7 D True
8 A False
9 B False
10 A True
Upvotes: 0
Views: 273
Reputation: 23217
You can .merge()
the 2 dataframes using a left merge with the original dataframe as base and turn on the indicator=
parameter to show the merge result. Then change the merge result to True
for the rows that appear in both dataframes and False
otherwise.
df_out = df1.merge(df2, on=['ID', 'Type'] , how='left', indicator='Result')
df_out['Result'] = (df_out['Result'] == 'both')
Explanation:
With indicator=
parameter turn on, Pandas will show you the merge result of which dataframe the current row are from (in terms of both
, left_only
and right_only
)
df_out = df1.merge(df2, on=['ID', 'Type'] , how='left', indicator='Result')
print(df_out)
ID Type Result
0 1 A both
1 2 B left_only
2 3 A both
3 4 A both
4 5 C both
5 6 A left_only
6 7 D both
7 8 A left_only
8 9 B left_only
9 10 A both
Then, we transform the both
and others to True
/False
by boolean mask, as follows:
df_out['Result'] = (df_out['Result'] == 'both')
print(df_out)
ID Type Result
0 1 A True
1 2 B False
2 3 A True
3 4 A True
4 5 C True
5 6 A False
6 7 D True
7 8 A False
8 9 B False
9 10 A True
Upvotes: 1