ARJ
ARJ

Reputation: 2080

How to append a third column in a dataframe using conditions from 2 other columns in a dataframe

I have the following data frame,

uuid    variable    value
AAS Highly_Active   False
AAS Highly_Active   True
SAP Highly_Active   False
SAP Multiple_days   True
YAS Highly_Active   False
YAS Highly_Active   False
YAS Busi_weekday    False

And I need to use the values from the column variable and value to define third column Activity and I have following classic python code doing it but my primary data frame is 121Mb in size and so its taking long time. Any pandas solutions would be great

def activity(row):
    if row['variable'] == "Highly_Active" and row['value'] ==True:
        val = "Highly_Active"    
    else:
        val = "NO"
    if row['variable'] == "Multiple_days" and row['value']==True:
        val = "Multiple_days"    
    else:
        val = "NO" 
    if row['variable'] == "Busi_weekday" and row['value']==True:
        val = "Busi_weekday"
    else:
        val="NO"
    return val

Upvotes: 0

Views: 121

Answers (2)

cs95
cs95

Reputation: 402814

KIS, np.where -

status = ["Highly_Active", "Multiple_days", "Busi_weekday"]

df['Activity'] = np.where(
  df['variable'].isin(status) & df['value'], 
  df['variable'], 
  'NO'
)

df

  uuid       variable  value       Activity
0  AAS  Highly_Active  False             NO
1  AAS  Highly_Active   True  Highly_Active
2  SAP  Highly_Active  False             NO
3  SAP  Multiple_days   True  Multiple_days
4  YAS  Highly_Active  False             NO
5  YAS  Highly_Active  False             NO
6  YAS   Busi_weekday  False             NO

If @Paul H is right and isin is redundant in the solution above, then you can just use pd.Series.where/pd.Series.mask -

df['variable'].where(df['value'], 'NO')

Or,

df['variable'].mask(~df['value'], 'NO')

df

  uuid       variable  value       Activity
0  AAS  Highly_Active  False             NO
1  AAS  Highly_Active   True  Highly_Active
2  SAP  Highly_Active  False             NO
3  SAP  Multiple_days   True  Multiple_days
4  YAS  Highly_Active  False             NO
5  YAS  Highly_Active  False             NO
6  YAS   Busi_weekday  False             NO

Upvotes: 3

BENY
BENY

Reputation: 323326

IIUC

df['Active']=(df.variable*df.value).replace('','No')
df
Out[653]: 
  uuid       variable  value         Active
0  AAS  Highly_Active  False             No
1  AAS  Highly_Active   True  Highly_Active
2  SAP  Highly_Active  False             No
3  SAP  Multiple_days   True  Multiple_days
4  YAS  Highly_Active  False             No
5  YAS  Highly_Active  False             No
6  YAS   Busi_weekday  False             No

Upvotes: 3

Related Questions