The Great
The Great

Reputation: 7733

Generate date from another date based column using ifelse and Mutate in R

I have 2 input columns such as Visit_start_date, Visit_type and one output column - Visit_end_date

I need to fill the values of Visit_end_date column based on the below condition

If visit_type column has values like "Outpatient" or "Emergency", then visit_end_date is same (equal) as visit_start_date. If it doesn't match, then I can just pick some random value from another column (Discharge Date)

I already tried this and unfortunately getting a number like '72842' as a value in visit_end_date column. It has to be in the date format.

Please note that Admit Date in dataframe is of the format "6/8/2169" and discharge date is of the format '85466.75694'

DF %>% 
mutate(visit_start_date = as.Date(.$Admit_Date,format = "%m/%d/%Y")) %>%  #dates in raw file are in the format as specified in the code
mutate(visit_end_date = ifelse((.$Enc_Type == 'Outpatient'|.$Enc_Type == 'Emergency'),visit_start_date, as.Date(.$Discharge_Date,origin ='1970-01-01')))} 

I expect the output to be "2169-06-08" in both visit_start_date and visit_end_date column. Currently it is "2169-06-08" in visit_start_date and 72842 in visit_end_date as shown in screenshot below.

enter image description here

Please find the dput command output

structure(list(Admit_Date = c("6/8/2169 9:40", "6/8/2169 9:41", "6/8/2169 9:42", "7/24/2169 8:51", "9/12/2169 10:30", "6/19/2237 12:15", "6/19/2237 12:15", "6/19/2237 12:15", "4/27/2238 14:07", "4/27/2238 14:07" ), Discharge_Date = c(85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694), Discharge_Disposition = c(NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_), Enc_Type = c("Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient"), Service = c("OVS / Eye", "Med / DM", "Med / DM", "Med / Renal", "Med / DM", "Dental / Dental Surg", "Dental / Dental Surg", "Dental / Dental Surg", "OVS / Eye", "OVS / Eye")), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

Upvotes: 1

Views: 1735

Answers (1)

Gregor Thomas
Gregor Thomas

Reputation: 145975

Here's how I would re-write your code. Note that you don't need to use .$ inside mutate, you can just use column names. Also, you can run multiple statements inside a single mutate.

With Date class objects, ifelse can drop the attributes necessary. dplyr::if_else does a good job:

DF %>%
  #dates in raw file are in the format as specified in the code
  mutate(
    visit_start_date = as.Date(Admit_Date, format = "%m/%d/%Y"),
    Discharge_Date = as.Date(Discharge_Date, origin = '1970-01-01'),
    visit_end_date = if_else(
      Enc_Type %in% c('Outpatient', 'Emergency'),
      visit_start_date,
      Discharge_Date
    )
  )
# # A tibble: 10 x 7
#    Admit_Date    Discharge_Date Discharge_Disposit~ Enc_Type Service       visit_start_date visit_end_date
#    <chr>         <date>         <chr>               <chr>    <chr>         <date>           <date>
#  1 6/8/2169 9:40 2204-01-01     NA                  Outpati~ OVS / Eye     2169-06-08       2169-06-08
#  2 6/8/2169 9:41 2204-01-01     NA                  Outpati~ Med / DM      2169-06-08       2169-06-08
#  3 6/8/2169 9:42 2204-01-01     NA                  Outpati~ Med / DM      2169-06-08       2169-06-08
#  4 7/24/2169 8:~ 2204-01-01     NA                  Outpati~ Med / Renal   2169-07-24       2169-07-24
#  5 9/12/2169 10~ 2204-01-01     NA                  Outpati~ Med / DM      2169-09-12       2169-09-12
#  6 6/19/2237 12~ 2204-01-01     NA                  Outpati~ Dental / Den~ 2237-06-19       2237-06-19
#  7 6/19/2237 12~ 2204-01-01     NA                  Outpati~ Dental / Den~ 2237-06-19       2237-06-19
#  8 6/19/2237 12~ 2204-01-01     NA                  Outpati~ Dental / Den~ 2237-06-19       2237-06-19
#  9 4/27/2238 14~ 2204-01-01     NA                  Outpati~ OVS / Eye     2238-04-27       2238-04-27
# 10 4/27/2238 14~ 2204-01-01     NA                  Outpati~ OVS / Eye     2238-04-27       2238-04-27

I can't test this because you didn't provide sample input. If it doesn't work, please edit your question to include some sample data shared with dput, e.g., dput(droplevels(DF[1:10, ])). Then I can test on that data and try to figure out what goes wrong.

Upvotes: 1

Related Questions