nwillems1

Reputation: 23

Map multiple columns from pandas dataframe to a dictionary and conditionally return a value to a new column

I have a pandas dataframe with multiple columns and a dictionary with keys that correspond to the column names. I want to check the column values with respect to the dictionary values and return either a 'yes' or 'no' based on whether the column value meets a "greater than or equal to" condition.

Example:

import pandas as pd
dfdict = {'col1': [1,2,3], 'col2':[2,3,4], 'col3': [3.2, 4.2, 7.7]}
checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}
df = pd.DataFrame(dfdict)

For each column, for each row, check whether the row value is greater than or equal to the value in the dictionary for that column. For that row, if any of the columns meets the condition, write "yes" to a newly created column; otherwise write "no".

What I've tried:

def checkcond(element):
    if not math.isnan(element):
        x = checkdict[element]
        return 1 if element >= x else 0
    else:
        pass

df['test'] = df.applymap(checkcond)

But of course this doesn't work: applymap passes only the cell value to checkcond, not the column name, so checkdict[element] looks up the value instead of the column.

I also tried:

df['test'] = pd.np.where(df[['col1', 'col2', 'col3']].ge(0).any(1, skipna=True), 'Y', 'N')

But that will only take one value for the "ge" condition, whereas I want to check the row value with respect to the dictionary value for each of the columns.

Any suggestions would be appreciated!

Upvotes: 2

Views: 1097

Answers (1)

mozway

Reputation: 260640

Convert your dictionary to a Series and perform a simple comparison; the Series index is matched against the DataFrame's column names:

df.ge(pd.Series(checkdict)).replace({True: 'yes', False: 'no'})

output:

  col1 col2 col3
0   no   no  yes
1  yes  yes  yes
2  yes  yes  yes
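
To make explicit what that comparison does, here is a minimal sketch using the question's data. It checks that `df.ge(pd.Series(checkdict))` gives the same boolean mask as comparing each column with its own dictionary value one at a time. The names `thresholds`, `mask` and `manual` are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 4], 'col3': [3.2, 4.2, 7.7]})
checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}

# The Series index ('col1', 'col2', 'col3') is matched to the column labels,
# so every column is compared with its own threshold.
thresholds = pd.Series(checkdict)
mask = df.ge(thresholds)

# The same comparison written out column by column, for illustration only.
manual = pd.DataFrame({col: df[col] >= limit for col, limit in checkdict.items()})

print(mask.equals(manual))  # True
```

The vectorized form avoids the per-cell function call that `applymap` would need.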

To get aggregation per row:

df['any'] = df.ge(pd.Series(checkdict)).any(axis=1).map({True: 'yes', False: 'no'})

output:

   col1  col2  col3  any
0     1     2   3.2  yes
1     2     3   4.2  yes
2     3     4   7.7  yes
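
Since every row in the example data meets at least one threshold, the "no" branch never shows up. The sketch below adds a hypothetical fourth row (0, 1, 0.5), which is not part of the question's data, so that both outcomes appear. It also uses `np.where`, in the spirit of the second attempt in the question, with NumPy imported directly because `pd.np` is no longer available in recent pandas versions. Selecting `df[list(checkdict)]` is an added precaution so that only the columns named in the dictionary are compared.

```python
import numpy as np
import pandas as pd

# Question data plus one hypothetical row that fails every threshold.
df = pd.DataFrame({
    'col1': [1, 2, 3, 0],
    'col2': [2, 3, 4, 1],
    'col3': [3.2, 4.2, 7.7, 0.5],
})
checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}

# Compare only the columns named in the dictionary, each against its own value.
meets = df[list(checkdict)].ge(pd.Series(checkdict))

# A row gets 'yes' if any of its columns meets the threshold, else 'no'.
df['any'] = np.where(meets.any(axis=1), 'yes', 'no')

print(df['any'].tolist())  # ['yes', 'yes', 'yes', 'no']
```

Comparisons against NaN evaluate to False, so a missing value simply counts as not meeting the condition and needs no separate `math.isnan` check.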

Upvotes: 3
