How to applying custom function mappings for dataframe failures

df = pd.DataFrame({
            'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
            'value': [15,21,1758,12],
            'target': ['fan','fan','set','lan'],
            'test': ['Yes', 'No', 'No', 'Yes']
    })

and a mapping document in another dataframe

mapper = pd.DataFrame({
        'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
        'message' : ['fan err message for [spid]','set err message for [spid]'],
        'to_change' : [('test','Nope'),('test','Save1')]
    })

The idea is to check for the condition listed in column check of the mapping document, if condition is met then add a row to the err_msg dataframe with the spid of that row and the message listed in the mapping document. And then do the changes listed in the to_change column.

Upvotes: 0

Views: 47

Answers (2)

Robert Szymański
Robert Szymański

Reputation: 11

The result when using the operations performed with the pandas library:

    import pandas as pd

    df = pd.DataFrame({
        'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
        'value': [15,21,1758,12],
        'target': ['fan','fan','set','lan'],
        'test': ['Yes', 'No', 'No', 'Yes']
    })

    mapper = pd.DataFrame({
        'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
        'message' : ['fan err message for [spid]','set err message for [spid]'],
        'to_change' : [('test','Nope'),('test','Save1')]
    })

    mapper['target']=pd.Series(['fan', 'set'])

    err_msg = pd.merge(df, mapper, on="target")
    err_msg['message'] = err_msg.apply(lambda row: row['message'].replace('[spid]', row['spid']), axis=1)
    err_msg = err_msg[['spid','message']]

    df['test2'] = pd.merge(df, mapper, on='target').apply(lambda row: row['to_change'][1] if row['test'] == 'Yes' else row['test'], axis=1)


    print(df)
    
       spid  value target test test2
    0  AB01     15    fan  Yes  Nope
    1  AB02     21    fan   No    No
    2  AB03   1758    set   No    No
    3  AB04     12    lan  Yes   NaN

    print(err_msg)

       spid                   message
    0  AB01  fan err message for AB01
    1  AB02  fan err message for AB02
    2  AB03  set err message for AB03

Upvotes: 1

I used a list of tuples to manage the to_change list. I evaluate the check filter in mapper resulting in a true or false match then use the dataframe loc to update the test column value for Yes and No values matches. A named tuple would be more readable

df = pd.DataFrame({
            'spid': ['AB01', 'AB02', 'AB03', 'AB04'],
            'value': [15,21,1758,12],
            'target': ['fan','fan','set','lan'],
            'test': ['Yes', 'No', 'No', 'Yes']
    })

mapper = pd.DataFrame({
        'check': ["row['target'] == 'fan'","row['target'] == 'set'"],
        'message' : ['fan err message for [spid]','set err message for [spid]'],
        'to_change' : [('test','Nope'),('test','Save1')]
    })

checks=mapper['check']
to_change_no=mapper['to_change'][0]
to_change_yes=mapper['to_change'][1]

filter=""
for check in checks:
    if filter=="":
        filter="("+check+")"
    else:
        filter+=" | " + "("+check+")"

for key,row in df.iterrows():
    result=eval(filter)
    if (result==True):
        if row['test']=='Yes':
            df.loc[key,'test']=to_change_yes[1]
        elif  row['test']=='No':
            df.loc[key,'test']=to_change_no[1]
        #print(row['test'])
        
print(df)

output:

 spid  value target   test
0  AB01     15    fan  Save1
1  AB02     21    fan   Nope
2  AB03   1758    set   Nope
3  AB04     12    lan    Yes

Upvotes: 0

Related Questions