Denis

Reputation: 99

Change values in columns of dataframe depending on values of other columns (values come from lists)

I have a pandas DataFrame in Python, for example this one:

  col1 col2 col3 col4
0    A    C    B    D
1    C    E    E    A
2    E    A    E    A
3    A    D    D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B

The code for the creation of the dataframe:

import pandas as pd

d = {'col1':['A','C','E','A','B','D','F','E','B'], 'col2':['C','E','A','D','B','D','F','E','B'],
     'col3':['B','E','E','D','B','D','A','E','B'], 'col4':['D','A','A','D','B','D','F','E','B']}
df = pd.DataFrame(data=d)

Let list1 be ['A','C','E'] and list2 be ['B','D','F']. What I want is the following: if col1 contains an element from list1 and one of col2-col4 contains an element from list2, then I want to eliminate the latter (i.e. replace it with '').

I have tried df['col2'].loc[(df['col1'] in list1) & (df[['col2'] in list2)]='', which is not quite what I want but at least goes in the right direction. Unfortunately, it doesn't work. Could someone help, please?

This is my expected output:

  col1 col2 col3 col4
0    A         B    D
1    C    E    E    A
2    E    A    E    A
3    A         D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B

Upvotes: 1

Views: 46

Answers (1)

jpp

Reputation: 164773

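pd.DataFrame.loc is an indexer on pd.DataFrame, so use it with your dataframe rather than with a series: assigning through df['col2'].loc[...] is chained indexing and may not update df. The in operator also does not test membership element-wise; use pd.Series.isin and pd.DataFrame.isin for that. In addition, you can combine criteria across multiple columns via pd.DataFrame.any: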

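# m1: rows where col1 holds a value from list1
m1 = df['col1'].isin(list1)
# m2: rows where any of col2..col4 holds a value from list2 (axis=1 tests across columns)
m2 = df[['col2', 'col3', 'col4']].isin(list2).any(axis=1)

df.loc[m1 & m2, 'col2'] = ''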

Result:

print(df)

  col1 col2 col3 col4
0    A         B    D
1    C    E    E    A
2    E    A    E    A
3    A         D    D
4    B    B    B    B
5    D    D    D    D
6    F    F    A    F
7    E    E    E    E
8    B    B    B    B
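
Note that the code above blanks only col2, which is what the posted expected output shows. If the intent is instead what the question's wording describes (blank every list2 value in col2-col4 on rows where col1 is in list1), a minimal sketch could look like the following. The names cols, row_mask and to_blank are my own, and this variant is an assumption about the intent, not something the question confirms:

```python
import pandas as pd

d = {'col1': ['A','C','E','A','B','D','F','E','B'],
     'col2': ['C','E','A','D','B','D','F','E','B'],
     'col3': ['B','E','E','D','B','D','A','E','B'],
     'col4': ['D','A','A','D','B','D','F','E','B']}
df = pd.DataFrame(data=d)

list1 = ['A', 'C', 'E']
list2 = ['B', 'D', 'F']
cols = ['col2', 'col3', 'col4']  # hypothetical helper name

# Rows where col1 holds a value from list1
row_mask = df['col1'].isin(list1)

# Cell-wise mask: True where a col2..col4 cell holds a value from list2
to_blank = df[cols].isin(list2)
# Keep the mask only on rows selected by row_mask
to_blank.loc[~row_mask, :] = False

# Replace the flagged cells with an empty string
df[cols] = df[cols].mask(to_blank, '')

print(df)
```

With the sample data this would blank col3 and col4 in row 0 and col2-col4 in row 3, leaving every other row unchanged.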

Upvotes: 2
