Bjc51192
Bjc51192

Reputation: 327

If statement to remap values

I'm trying to create an if statement that populates the tool type column with Legal if the Pay Category column is Expense.

However, it is flaging everything that contains Legal as Legal, regardless of Pay Category.

test={"Pay Category":["Indemnity","Indemnity","Indemnity","Indemnity","Expense","Expense","Expense","Medical"],"Description of Payment":["Legal","Legal","Legal","Legal","Legal","Legal","Frog","Legal",]}
test=pd.DataFrame(test)

test["Tool Type"]=""
if (test["Pay Category"]=="Medical") is not False: 
test["Tool Type"][test["Description of Payment"].str.contains("Pharmacy|Prescription|RX",case=False)]="Pharmacy"

if (test["Pay Category"]=='Expense') is not False:
test["Tool Type"][test["Description of Payment"].str.contains("Legal|Attorney|Court|Defense",case=False)]="Legal"

My understanding is, if (test["Pay Category"]=='Expense') is not False: is a boolean value which is either True or False, it should only execute the if statement if the criteria " is not False" is met. What am I missing?

Brandon

Upvotes: 1

Views: 73

Answers (1)

jezrael
jezrael

Reputation: 863226

I think you need add conditions and chain them with & (and):

test["Tool Type"]=""
m1 = test["Description of Payment"].str.contains("Pharmacy|Prescription|RX",case=False)
m2 = test["Pay Category"]=="Medical"

m3 = test["Description of Payment"].str.contains("Legal|Attorney|Court|Defense",case=False)
m4 = test["Pay Category"]=="Expense"

test.loc[m1 & m2, "Tool Type"]="Pharmacy"
test.loc[m3 & m4, "Tool Type"]="Legal"
print (test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          

Another solution with double numpy.where:

test['Tool Type'] = np.where(m1 & m2, 'Pharmacy',
                    np.where(m3 & m4, 'Legal', ''))
print (test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          

EDIT: Very nice solution with unutbu comment is use numpy.select

test['Tool Type']  = np.select([(m1 & m2), (m3 & m4)], ['Pharmacy', 'Legal'], default='')
print (test)
  Description of Payment Pay Category Tool Type
0                  Legal    Indemnity          
1                  Legal    Indemnity          
2                  Legal    Indemnity          
3                  Legal    Indemnity          
4                  Legal      Expense     Legal
5                  Legal      Expense     Legal
6                   Frog      Expense          
7                  Legal      Medical          

Upvotes: 3

Related Questions