Conditional mapping to a dataframe based on multiple columns

Question

I have a dataframe where I need to map categories based on value-based conditions on two separate columns. Total rows to do this are about a million.

Sample dataframe is:

df = pd.DataFrame({'col1':['B','A','A','B','C','B','C','C','A'],
               'col2':[10,30,40,20,60,30,70,80,50]})

Now, the conditions for True are:

A: >30
B: >20
C: >60

If the value in col2 are as per above condition, then the result is True(1), else False(0).

Expected outcome is:

    col1    col2    result
0   B   10  0
1   A   30  0
2   A   40  1
3   B   20  1
4   C   60  0
5   B   30  1
6   C   70  1
7   C   80  1
8   A   50  1

jezrael · Accepted Answer

You can chain masks by | for bitwise OR:

df['result'] = (df['col1']=='A') & (df['col2']>30) |
               (df['col1']=='B') & (df['col2']>10) | 
               (df['col1']=='C') & (df['col2']>60)

Or:

df['result'] = np.where((df['col1']=='A') & (df['col2']>30) |
                        (df['col1']=='B') & (df['col2']>10) | 
                        (df['col1']=='C') & (df['col2']>60), 1, 0)

Conditional mapping to a dataframe based on multiple columns

Answers (2)

Related Questions