Reputation: 1601
I have the following dataframe which I'll call 'names':
date name code
6/1/2018 A 5
6/1/2018 B 5
7/1/2018 A 5
7/1/2018 B 5
I have the following df which I need to alter:
date name comment
5/1/2018 A 'Good'
6/1/2018 A 'Good'
6/1/2018 B 'Good'
6/1/2018 C 'Good'
7/1/2018 A 'Good'
7/1/2018 B 'Good'
I need to change the comment to 'Bad' if the (date, name) pair of a row doesn't appear in the names dataframe.
Right now I have:
df['comment'] = np.where(~df['name'].isin(names['name']), 'Bad', df['comment'])
Though obviously that doesn't work because it doesn't take into account name AND date.
Final output:
date name comment
5/1/2018 A 'Bad'
6/1/2018 A 'Good'
6/1/2018 B 'Good'
6/1/2018 C 'Bad'
7/1/2018 A 'Good'
7/1/2018 B 'Good'
The first row was changed because there's no A entry for 5/1 in the names dataframe. The C row was changed because there's no C entry for 6/1 in the names df (or rather no C entry at all).
Note: Both dataframes (names and df) are larger than I've shown, both row and column-wise.
Upvotes: 2
Views: 4072
Reputation: 402523
Performant solution using pd.Index.get_indexer:
v = names.set_index(['date', 'name'])
m = v.index.get_indexer(pd.MultiIndex.from_arrays([df.date, df.name])) == -1
df.loc[m, 'comment'] = '\'Bad\''
print(df)
       date name comment
0  5/1/2018    A   'Bad'
1  6/1/2018    A  'Good'
2  6/1/2018    B  'Good'
3  6/1/2018    C   'Bad'
4  7/1/2018    A  'Good'
5  7/1/2018    B  'Good'
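To make the role of -1 concrete, here is a minimal self-contained sketch of the same idea. It assumes the dates are stored as plain strings and that the quotes are part of the comment values, as in the printed output; the names `known`, `wanted`, and `pos` are illustrative only. Note that get_indexer raises an error on a non-unique index, so the sketch drops duplicate (date, name) pairs first.

```python
import pandas as pd

# Sample data from the question (dates kept as strings: an assumption).
names = pd.DataFrame({
    'date': ['6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'B', 'A', 'B'],
    'code': [5, 5, 5, 5],
})
df = pd.DataFrame({
    'date': ['5/1/2018', '6/1/2018', '6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'A', 'B', 'C', 'A', 'B'],
    'comment': ["'Good'"] * 6,
})

# Known (date, name) pairs; get_indexer requires them to be unique.
known = pd.MultiIndex.from_frame(names[['date', 'name']].drop_duplicates())
# Pairs to look up, one per row of df.
wanted = pd.MultiIndex.from_frame(df[['date', 'name']])

# Position of each wanted pair inside known, or -1 when it is absent.
pos = known.get_indexer(wanted)

# Overwrite the comment wherever the pair was not found.
df.loc[pos == -1, 'comment'] = "'Bad'"
print(df)
```

Because the mask is a positional NumPy array, this does not depend on df having a default integer index.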
Alternatively, do a LEFT OUTER merge, determine missing values in the right DataFrame, and use that to mask rows:
m = df.merge(names, how='left', on=['date', 'name']).code.isna()
df['comment'] = df['comment'].mask(m, '\'Bad\'')
print(df)
       date name comment
0  5/1/2018    A   'Bad'
1  6/1/2018    A  'Good'
2  6/1/2018    B  'Good'
3  6/1/2018    C   'Bad'
4  7/1/2018    A  'Good'
5  7/1/2018    B  'Good'
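The merge approach relies on `code` never being NaN in names, and on names having at most one row per (date, name) pair so that the merged result lines up with df row for row. A possible variant that avoids both assumptions is sketched below, using the `indicator=True` option of merge; the variable names `merged` and `missing` are illustrative, and the sample data assumes string dates and quoted comments.

```python
import pandas as pd

# Sample data from the question (dates kept as strings: an assumption).
names = pd.DataFrame({
    'date': ['6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'B', 'A', 'B'],
    'code': [5, 5, 5, 5],
})
df = pd.DataFrame({
    'date': ['5/1/2018', '6/1/2018', '6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'A', 'B', 'C', 'A', 'B'],
    'comment': ["'Good'"] * 6,
})

keys = ['date', 'name']

# Merge against the de-duplicated key columns only, so the result keeps
# exactly one row per row of df, in the same order.
merged = df.merge(names[keys].drop_duplicates(), how='left', on=keys, indicator=True)

# '_merge' is 'left_only' for rows of df with no matching pair in names.
missing = (merged['_merge'] == 'left_only').to_numpy()

# Replace the comment where the pair is missing; keep it otherwise.
df['comment'] = df['comment'].mask(missing, "'Bad'")
print(df)
```

Converting the mask to a NumPy array makes the replacement positional, which matters if df does not have a default integer index.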
Upvotes: 2
Reputation: 164683
You can use pd.Index.isin followed by pd.Series.where:
idx_cols = ['date', 'name']
mask = df.set_index(idx_cols).index.isin(names.set_index(idx_cols).index)
df['comment'] = df['comment'].where(mask, '\'Bad\'')
print(df)
       date name comment
0  5/1/2018    A   'Bad'
1  6/1/2018    A  'Good'
2  6/1/2018    B  'Good'
3  6/1/2018    C   'Bad'
4  7/1/2018    A  'Good'
5  7/1/2018    B  'Good'
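The same membership test also slots into the np.where form used in the question. The following is a minimal sketch, assuming string dates and quoted comment values as in the printed output; `present` is an illustrative name. Unlike get_indexer, isin does not require the (date, name) pairs in names to be unique.

```python
import numpy as np
import pandas as pd

# Sample data from the question (dates kept as strings: an assumption).
names = pd.DataFrame({
    'date': ['6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'B', 'A', 'B'],
    'code': [5, 5, 5, 5],
})
df = pd.DataFrame({
    'date': ['5/1/2018', '6/1/2018', '6/1/2018', '6/1/2018', '7/1/2018', '7/1/2018'],
    'name': ['A', 'A', 'B', 'C', 'A', 'B'],
    'comment': ["'Good'"] * 6,
})

keys = ['date', 'name']

# True where the row's (date, name) pair also occurs in names.
present = pd.MultiIndex.from_frame(df[keys]).isin(pd.MultiIndex.from_frame(names[keys]))

# Keep the existing comment for matched pairs, otherwise write 'Bad'.
df['comment'] = np.where(present, df['comment'], "'Bad'")
print(df)
```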
Upvotes: 2