Reputation: 167
I want to highlight in red every cell in a pandas DataFrame column that fails a validity check. Here is my code. In my implementation the whole Email column is highlighted red instead of only the cells that fail.
import pandas as pd

# raw data
df = pd.DataFrame({'Username': ['arenzo', 'brenzo', 'crenzo', 'drenzo'],
                   'Email': ['[email protected]', '[email protected]', '[email protected]', '[email protected]']})

# email validity function
def emailcheck(df):
    validcode = (df['Email'].str.contains('@')
                 & df['Email'].str.contains('.org', case=False)
                 & df['Email'].str.contains('sales', case=False))
    return validcode

def highlight_email(s):
    if emailcheck(df).all():
        color = ''
    else:
        color = 'red'
    return 'background-color: %s' % color

df.style.applymap(highlight_email, subset=pd.IndexSlice[:, ['Email']])
# dataframe
Username  Email
arenzo    [email protected]
brenzo    [email protected]
crenzo    [email protected]
drenzo    [email protected]
# last 2 rows under email column should be highlighted red
Upvotes: 2
Views: 1583
Reputation: 9274
Your emailcheck function evaluates the condition on the whole Email column at once:
validcode = (df['Email'].str.contains('@')) & (df['Email'].str.contains('.org', case= False) & (df['Email'].str.contains('sales', case=False)))
Your if statement then tests whether all of those values are True:
if emailcheck(df).all():
    color = ''
For your data this is always False, because .all() is True only when every email passes and some of yours don't. The function also never looks at its argument s, so it returns the same answer, red, for every cell. Instead, keep only the second function and test the individual cell value it receives:
def highlight_email(s):
    if '@' in s and '.org' in s and 'sales' in s:
        color = ''
    else:
        color = 'red'
    return 'background-color: {c}'.format(c=color)
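For reference, here is a slightly more defensive sketch of the same per-cell idea. It rests on a few assumptions of mine: the check is case-insensitive (mirroring the case=False in the question), non-string cells such as NaN count as invalid, and the helper names is_valid_email and style_invalid_emails are made up for illustration. Note that '.org' is matched as literal text here, whereas str.contains treats the dot as a regex wildcard by default.

```python
def is_valid_email(value):
    """Return True if a single cell value passes the question's three checks."""
    # Assumption: anything that is not a string (e.g. NaN) is treated as invalid.
    if not isinstance(value, str):
        return False
    lowered = value.lower()  # case-insensitive, like case=False in the question
    return "@" in lowered and ".org" in lowered and "sales" in lowered


def highlight_email(value):
    """Return a CSS string for one cell: red background if invalid, else no style."""
    return "" if is_valid_email(value) else "background-color: red"


def style_invalid_emails(df, column="Email"):
    """Apply highlight_email to each cell of one column and return the Styler."""
    styler = df.style
    # Styler.applymap was renamed to Styler.map in pandas 2.1;
    # fall back to applymap on older versions.
    elementwise = getattr(styler, "map", None) or styler.applymap
    return elementwise(highlight_email, subset=[column])
```

In a notebook, style_invalid_emails(df) with the df from the question should render only the failing Email cells in red, since the function is called once per cell with that cell's value.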
Upvotes: 1