Reputation: 4253
df = pd.DataFrame({
'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
'value': [15,21,1758,12],
'target': ['fan','fan','set','lan'],
'test': ['Yes', 'No', 'No', 'Yes']
})
and a mapping document in another dataframe
mapper = pd.DataFrame({
'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
'message' : ['fan err message for [spid]','set err message for [spid]'],
'to_change' : [('test','Nope'),('test','Save1')]
})
The idea is to check for the condition listed in column check of the mapping document, if condition is met then add a row to the
err_msg
dataframe with thespid
of that row and the message listed in the mapping document. And then do the changes listed in theto_change column
.
Upvotes: 0
Views: 47
Reputation: 11
The result when using the operations performed with the pandas library:
import pandas as pd
df = pd.DataFrame({
'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
'value': [15,21,1758,12],
'target': ['fan','fan','set','lan'],
'test': ['Yes', 'No', 'No', 'Yes']
})
mapper = pd.DataFrame({
'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
'message' : ['fan err message for [spid]','set err message for [spid]'],
'to_change' : [('test','Nope'),('test','Save1')]
})
mapper['target']=pd.Series(['fan', 'set'])
err_msg = pd.merge(df, mapper, on="target")
err_msg['message'] = err_msg.apply(lambda row: row['message'].replace('[spid]', row['spid']), axis=1)
err_msg = err_msg[['spid','message']]
df['test2'] = pd.merge(df, mapper, on='target').apply(lambda row: row['to_change'][1] if row['test'] == 'Yes' else row['test'], axis=1)
print(df)
spid value target test test2
0 AB01 15 fan Yes Nope
1 AB02 21 fan No No
2 AB03 1758 set No No
3 AB04 12 lan Yes NaN
print(err_msg)
spid message
0 AB01 fan err message for AB01
1 AB02 fan err message for AB02
2 AB03 set err message for AB03
Upvotes: 1
Reputation: 4253
I used a list of tuples to manage the to_change list. I evaluate the check filter in mapper resulting in a true or false match then use the dataframe loc to update the test column value for Yes and No values matches. A named tuple would be more readable
df = pd.DataFrame({
'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
'value': [15,21,1758,12],
'target': ['fan','fan','set','lan'],
'test': ['Yes', 'No', 'No', 'Yes']
})
mapper = pd.DataFrame({
'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
'message' : ['fan err message for [spid]','set err message for [spid]'],
'to_change' : [('test','Nope'),('test','Save1')]
})
checks=mapper['check']
to_change_no=mapper['to_change'][0]
to_change_yes=mapper['to_change'][1]
filter=""
for check in checks:
if filter=="":
filter="("+check+")"
else:
filter+=" | " + "("+check+")"
for key,row in df.iterrows():
result=eval(filter)
if (result==True):
if row['test']=='Yes':
df.loc[key,'test']=to_change_yes[1]
elif row['test']=='No':
df.loc[key,'test']=to_change_no[1]
#print(row['test'])
print(df)
output:
spid value target test
0 AB01 15 fan Save1
1 AB02 21 fan Nope
2 AB03 1758 set Nope
3 AB04 12 lan Yes
Upvotes: 0