Aly

Reputation: 367

How to get a list of strings from a DataFrame column that do not appear in another column?

I have a pandas DataFrame like this:

   feature name 1  feature name 2
0  A               B
1  A               B
2  A               C
3  B               C
4  B               D

I want to get a list of the values in "feature name 2", excluding any that also occur in "feature name 1". The desired output would look like this:

list = [C,D] 

since B occurs in the first column.

Upvotes: 1

Views: 33

Answers (1)

jezrael

Reputation: 862671

Use Series.isin with boolean indexing to keep only the values that do not appear in the first column:

mask = df['feature name 2'].isin(df['feature name 1'])
L = df.loc[~mask, 'feature name 2'].unique().tolist()

Or numpy.setdiff1d:

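import numpy as np
# setdiff1d returns the sorted unique values of the first array that are not in the second
L = np.setdiff1d(df['feature name 2'], df['feature name 1']).tolist()
print(L)
['C', 'D']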
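
For anyone who wants to run both variants end to end, here is a minimal sketch. The DataFrame is rebuilt from the sample in the question, and the names by_mask and by_setdiff are only illustrative, not part of the original snippets:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    "feature name 1": ["A", "A", "A", "B", "B"],
    "feature name 2": ["B", "B", "C", "C", "D"],
})

# Variant 1: boolean mask; unique() keeps the order of first appearance
mask = df["feature name 2"].isin(df["feature name 1"])
by_mask = df.loc[~mask, "feature name 2"].unique().tolist()

# Variant 2: set difference; the result is sorted and already unique
by_setdiff = np.setdiff1d(df["feature name 2"], df["feature name 1"]).tolist()

print(by_mask)     # ['C', 'D']
print(by_setdiff)  # ['C', 'D']
```

Both give the same result on this data. They can differ in ordering on other data: the isin variant preserves the order in which values first appear in the column, while setdiff1d always returns them sorted.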

Upvotes: 1
