Reputation: 99
I have a DataFrame in Python, for example this one:
  col1 col2 col3 col4
0    A    C    B    D
1    C    E    E    A
2    E    A    E    A
3    A    D    D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B
The code for the creation of the dataframe:
import pandas as pd

d = {'col1': ['A','C','E','A','B','D','F','E','B'], 'col2': ['C','E','A','D','B','D','F','E','B'],
     'col3': ['B','E','E','D','B','D','A','E','B'], 'col4': ['D','A','A','D','B','D','F','E','B']}
df = pd.DataFrame(data=d)
Let list1 be ['A','C','E'] and list2 be ['B','D','F']. What I want is the following: if col1 contains an element from list1 and one of col2-col4 contains an element from list2, then I want to eliminate the latter (that is, replace it with '').
I have tried df['col2'].loc[(df['col1'] in list1) & (df[['col2'] in list2)]=''
which is not quite what I want but at least goes in the right direction. Unfortunately, it doesn't work. Could someone help, please?
This is my expected output:
  col1 col2 col3 col4
0    A         B    D
1    C    E    E    A
2    E    A    E    A
3    A         D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B
Upvotes: 1
Views: 46
Reputation: 164773
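pd.DataFrame.loc is an indexer of pd.DataFrame, so use it on your dataframe directly, not on a series selected from it: assigning through df['col2'].loc[...] is chained indexing and may not update df. The in operator also does not test membership element-wise; use isin instead. In addition, you can test criteria across multiple columns via pd.DataFrame.any: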
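m1 = df['col1'].isin(list1)  # True where col1 holds a value from list1
m2 = df[['col2', 'col3', 'col4']].isin(list2).any(axis=1)  # True where any of col2-col4 holds a value from list2
df.loc[m1 & m2, 'col2'] = ''  # blank out col2 on rows where both conditions hold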
Result:
print(df)
  col1 col2 col3 col4
0    A         B    D
1    C    E    E    A
2    E    A    E    A
3    A         D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B
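The approach above blanks col2 only, which matches the expected output in the question. If the wording of the question is read literally (blank every cell in col2-col4 that holds a list2 value, on rows where col1 holds a list1 value), a per-cell mask could be used instead. This is only a sketch under that assumption; the names cols, row_mask, to_blank and result are illustrative and do not appear in the question.

```python
import pandas as pd

d = {'col1': ['A','C','E','A','B','D','F','E','B'],
     'col2': ['C','E','A','D','B','D','F','E','B'],
     'col3': ['B','E','E','D','B','D','A','E','B'],
     'col4': ['D','A','A','D','B','D','F','E','B']}
df = pd.DataFrame(data=d)
list1 = ['A', 'C', 'E']
list2 = ['B', 'D', 'F']

cols = ['col2', 'col3', 'col4']        # columns to inspect (illustrative name)

row_mask = df['col1'].isin(list1)      # one boolean per row
to_blank = df[cols].isin(list2)        # one boolean per cell in col2-col4
to_blank.loc[~row_mask, :] = False     # ignore rows whose col1 is not in list1

result = df.copy()                     # leave the original df untouched
result[cols] = result[cols].mask(to_blank, '')  # replace flagged cells with ''
```

With this reading, row 0 keeps 'C' in col2 but loses 'B' and 'D', and row 3 loses all three 'D' values, so the result differs from the output shown above.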
Upvotes: 2