Reputation: 367
I have a pandas DataFrame like this:
  feature name 1 feature name 2
0              A              B
1              A              B
2              A              C
3              B              C
4              B              D
I want to get a list of the values in "feature name 2" that do not occur anywhere in "feature name 1". The desired output would look like this:
list = [C,D]
since B also occurs in the first column.
Upvotes: 1
Views: 33
Reputation: 862671
Use Series.isin in boolean indexing:
mask = df['feature name 2'].isin(df['feature name 1'])
L = df.loc[~mask, 'feature name 2'].unique().tolist()
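For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question (the construction of df is my assumption about how the data was created) and prints the intermediate mask:

```python
import pandas as pd

# Sample data copied from the question; how df is built is an assumption
df = pd.DataFrame({
    'feature name 1': ['A', 'A', 'A', 'B', 'B'],
    'feature name 2': ['B', 'B', 'C', 'C', 'D'],
})

# True where the value in column 2 also occurs anywhere in column 1
mask = df['feature name 2'].isin(df['feature name 1'])
print(mask.tolist())  # [True, True, False, False, False]

# Keep rows where mask is False, then de-duplicate in order of first appearance
L = df.loc[~mask, 'feature name 2'].unique().tolist()
print(L)  # ['C', 'D']
```

Note that isin compares against all values of the other column, not row by row, which is what the question asks for.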
Or numpy.setdiff1d:
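import numpy as np

# setdiff1d returns the sorted unique values of the first array that are not in the second
L = np.setdiff1d(df['feature name 2'], df['feature name 1']).tolist()
print(L)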
['C', 'D']
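The two approaches give the same list for the sample data, but they can differ in ordering: np.setdiff1d returns sorted unique values, while the isin version keeps the order of first appearance. A small sketch on a hypothetical frame (df2 and its values are made up for illustration, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data chosen so that order of appearance differs from sorted order
df2 = pd.DataFrame({
    'feature name 1': ['A', 'A', 'B'],
    'feature name 2': ['Z', 'B', 'C'],
})

# isin + unique: keeps the order in which values first appear in column 2
mask = df2['feature name 2'].isin(df2['feature name 1'])
by_isin = df2.loc[~mask, 'feature name 2'].unique().tolist()

# setdiff1d: same set of values, but sorted
by_setdiff = np.setdiff1d(df2['feature name 2'], df2['feature name 1']).tolist()

print(by_isin)     # ['Z', 'C']
print(by_setdiff)  # ['C', 'Z']
```

If the original row order matters, the isin version is probably the safer choice; if you only need the set of values, either works.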
Upvotes: 1