Reputation: 2219
Ultimately, I want to flag a new column, 'ExclusionFlag', with 0 or 1 depending on whether a value that is found in a list exists anywhere in the row.
df = pd.DataFrame([['cat','b','c','c'],['a','b','c','c'],['a','b','c','dog'],['dog','b','c','c']],columns=['Code1','Code2','Code3','Code4'])
excluded_codes = ['cat','dog']
#Attempt
df['ExclusionFlag'] = df.apply(lambda x: 'cat' in x.values, axis=1).any()
#Side note: I have 120 columns to check. They're labeled Code1, Code2...Code120.
#Desired outcome
Code1 Code2 Code3 Code4 ExclusionFlag
0 cat b c c 1
1 a b c c 0
2 a b c dog 1
3 dog b c c 1
The line of code I have marks every row as True.
And when I substitute the excluded_codes list for 'cat' in my lambda expression, I get an error.
I found a couple of similar questions, but the answers typically call out a specific column (see below). I don't think iterating through 120 columns is the best approach, although I could be wrong.
df['ExclusionFlag'] = df['Code1'].isin(excluded_codes)
Upvotes: 4
Views: 6325
Reputation: 99
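If you want True or False values in the new column, you can drop astype and keep only any across the columns (assigning df.isin(excluded_codes) directly would try to put a whole DataFrame into a single column).
df['ExclusionFlag'] = df.isin(excluded_codes).any(axis=1)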
You can also check a specific column:
df['ExclusionFlag'] = df['Code2'].isin(excluded_codes)
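For reference, here is a minimal runnable sketch of the boolean version, using the sample frame from the question. It also suggests why the original attempt flagged every row: the trailing .any() collapses the per-row results into one scalar, which is then written to every row. The names row_hits and collapsed are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [['cat', 'b', 'c', 'c'], ['a', 'b', 'c', 'c'],
     ['a', 'b', 'c', 'dog'], ['dog', 'b', 'c', 'c']],
    columns=['Code1', 'Code2', 'Code3', 'Code4'],
)
excluded_codes = ['cat', 'dog']

# Per-row result of the lambda from the question: one boolean per row.
row_hits = df.apply(lambda x: 'cat' in x.values, axis=1)

# A trailing .any() reduces that Series to a single scalar, so assigning
# it to a column fills every row with the same value.
collapsed = row_hits.any()

# Element-wise membership test on the whole frame, then reduce across each row.
df['ExclusionFlag'] = df.isin(excluded_codes).any(axis=1)
```

Here row_hits still holds one value per row, while collapsed is a single True, which matches the "every row is True" symptom described in the question.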
Upvotes: 2
Reputation: 38415
Something like
df['ExclusionFlag'] = df.isin(excluded_codes).any(axis=1).astype(int)
Code1 Code2 Code3 Code4 ExclusionFlag
0 cat b c c 1
1 a b c c 0
2 a b c dog 1
3 dog b c c 1
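If the real frame has other columns besides the 120 code columns, it may be safer to restrict the check to the code columns only. This is a sketch, assuming the columns follow the Code1...Code120 naming from the question. The extra Id column is hypothetical and only there to show that it is ignored.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 'cat', 'b', 'c', 'c'], [2, 'a', 'b', 'c', 'c'],
     [3, 'a', 'b', 'c', 'dog'], [4, 'dog', 'b', 'c', 'c']],
    # 'Id' is a hypothetical non-code column, not part of the question.
    columns=['Id', 'Code1', 'Code2', 'Code3', 'Code4'],
)
excluded_codes = ['cat', 'dog']

# Pick up Code1, Code2, ... by name instead of listing them by hand.
code_cols = df.filter(regex=r'^Code\d+$').columns

# Membership test on the code columns only, reduced per row, cast to 0/1.
df['ExclusionFlag'] = df[code_cols].isin(excluded_codes).any(axis=1).astype(int)
```

The same three calls should work unchanged for 120 columns, since nothing in them depends on the column count.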
Upvotes: 11