Raven
Raven

Reputation: 859

IF ELSE Conditional Pandas

I have a bit of a puzzle which I similar to other questions but there is a slight twist.

I have a dataframe - see below. Each record is unique and some records have multiple admit locations in the Concat column. The CONCAT columns reflects the progression of a patients admissions location status.

I want to know where patients ended.

I know that if the text within the CONCAT column is '3E PICU' or '6EN' or '3MN' or '6E' or '3MC' regardless of any other text that is in the column, they ended in the ICU.

I know that if a patient had any of the following admit locations with the CONCAT column, WITHOUT any of the ICU locations they can be considered "ACUTE": '4E' or '5E NSU' or '3E HKU'(see code below for full list of locations).

I know that if a patient had APU or CPU or PSU regardless of any other location that is in the CONCAT column, they can be considered "Psych".

I know that if patient is not considered ICU or ACUTE or PSYCH, they were not admitted.

Current Data

 ID              Concat 
  1              MAIN, 3E HKU, 6EN 
  2              ED Eval and Treatment Unit
  3              ED Main, 3E PICU
  4              ED Main, APU

Desired Data

 ID              Concat                              Admit Status 
  1              MAIN, 3E HKU, 6EN                       ICU
  2              ED Eval and Treatment Unit            Non-Admit
  3              ED Main, PICU                           ICU
  4              ED Main, APU                           Psych
  5              ED Main, 5E NSU, 3E HKU                Acute

I am familiar with the str.contains code but I need some help in illogically mapping out the code, especially if if else conditions are required.

 condition_one=new_ADM1["concat"].str.contains("3E PICU|6EN|3MN|6E|3MC", case=False)
 condition_two=new_ADM1["concat"].str.contains("4E|5E NSU|3E HKU|3E|4MN|5E SCU|4MA|7E|7E IRU", case=False)
 condition_three=new_ADM1["concat"].str.contains("APU|CPU|PSU", case=False)

Upvotes: 1

Views: 98

Answers (1)

Shubham Sharma
Shubham Sharma

Reputation: 71707

Use, Series.str.contains along with the given regex patterns, then use np.select to select the items from choices based on the conditions m1, m2 & m3:

m1 = df["Concat"].str.contains("(?i)(?:3E PICU|6EN|3MN|6E|3MC)$")
m2 = df["Concat"].str.contains("(?i)(?:4E|5E NSU|3E HKU|3E|4MN|5E SCU|4MA|7E|7E IRU)$")
m3 = df["Concat"].str.contains("(?i)(?:APU|CPU|PSU)$")

df['Admit Status'] = np.select([m1, m2, m3], ['ICU', 'Acute', 'Psych'], 'Non-Admit')

Result:

# print(df)
   ID                      Concat Admit Status
0   1           MAIN, 3E HKU, 6EN          ICU
1   2  ED Eval and Treatment Unit    Non-Admit
2   3            ED Main, 3E PICU          ICU
3   4                ED Main, APU        Psych

Upvotes: 1

Related Questions