Reputation: 1946
I have two pandas DataFrames:
df1
key id count
100 9821 7
200 9813 10
df2
nodekey nodeid
100 9821
200 9813
If the nodekey+nodeid in df2 match key+id in df1, count in df1 has to be set to 0. So, the result of the example above should be;
key id count
100 9821 0
200 9813 0
I tried the following (matching on key and nodekey only, as a test) but receive an error:
df1['count']=np.where((df1.key == df2.nodekey),0)
ValueError: either both or neither of x and y should be given
Suggestions?
Upvotes: 1
Views: 62
Reputation: 57033
Merge the dataframes using the left merge (the rows that are present in df1 but not in df2 will be filled with nan
s):
combined = df1.merge(df2, left_on=['key', 'id'],
right_on=['nodekey', 'nodeid'], how='left')
Update the counts for the rows that are non-nan
:
combined.loc[combined.nodekey.notnull(), 'count'] = 0
Cleanup the unwanted columns:
combined.drop(['nodekey', 'nodeid'], axis=1, inplace=True)
# key id count
#0 100 9821 0
#1 200 9813 0
#2 300 9855 7
Upvotes: 0
Reputation: 59264
This should work
df1.loc[df1[['key', 'id']].transform(tuple,1).isin(df2[['nodekey', 'nodeid']].transform(tuple,1)), "count"] = 0
which is basically using
df.loc[mask, 'count']=0
where mask
is True
for rows where tuple ('key', 'id')
matches any tuple ('nodekey', 'nodeid')
Upvotes: 2