Reputation: 7733
I have 2 input columns such as Visit_start_date, Visit_type and one output column - Visit_end_date
I need to fill the values of Visit_end_date column based on the below condition
If visit_type column has values like "Outpatient" or "Emergency", then visit_end_date is same (equal) as visit_start_date. If it doesn't match, then I can just pick some random value from another column (Discharge Date)
I already tried this and unfortunately getting a number like '72842' as a value in visit_end_date column. It has to be in the date format.
Please note that Admit Date in dataframe is of the format "6/8/2169" and discharge date is of the format '85466.75694'
DF %>%
mutate(visit_start_date = as.Date(.$Admit_Date,format = "%m/%d/%Y")) %>% #dates in raw file are in the format as specified in the code
mutate(visit_end_date = ifelse((.$Enc_Type == 'Outpatient'|.$Enc_Type == 'Emergency'),visit_start_date, as.Date(.$Discharge_Date,origin ='1970-01-01')))}
I expect the output to be "2169-06-08" in both visit_start_date and visit_end_date column. Currently it is "2169-06-08" in visit_start_date and 72842 in visit_end_date as shown in screenshot below.
Please find the dput command output
structure(list(Admit_Date = c("6/8/2169 9:40", "6/8/2169 9:41", "6/8/2169 9:42", "7/24/2169 8:51", "9/12/2169 10:30", "6/19/2237 12:15", "6/19/2237 12:15", "6/19/2237 12:15", "4/27/2238 14:07", "4/27/2238 14:07" ), Discharge_Date = c(85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694, 85466.75694), Discharge_Disposition = c(NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_), Enc_Type = c("Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient"), Service = c("OVS / Eye", "Med / DM", "Med / DM", "Med / Renal", "Med / DM", "Dental / Dental Surg", "Dental / Dental Surg", "Dental / Dental Surg", "OVS / Eye", "OVS / Eye")), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
Upvotes: 1
Views: 1735
Reputation: 145975
Here's how I would re-write your code. Note that you don't need to use .$
inside mutate
, you can just use column names. Also, you can run multiple statements inside a single mutate
.
With Date
class objects, ifelse
can drop the attributes necessary. dplyr::if_else
does a good job:
DF %>%
#dates in raw file are in the format as specified in the code
mutate(
visit_start_date = as.Date(Admit_Date, format = "%m/%d/%Y"),
Discharge_Date = as.Date(Discharge_Date, origin = '1970-01-01'),
visit_end_date = if_else(
Enc_Type %in% c('Outpatient', 'Emergency'),
visit_start_date,
Discharge_Date
)
)
# # A tibble: 10 x 7
# Admit_Date Discharge_Date Discharge_Disposit~ Enc_Type Service visit_start_date visit_end_date
# <chr> <date> <chr> <chr> <chr> <date> <date>
# 1 6/8/2169 9:40 2204-01-01 NA Outpati~ OVS / Eye 2169-06-08 2169-06-08
# 2 6/8/2169 9:41 2204-01-01 NA Outpati~ Med / DM 2169-06-08 2169-06-08
# 3 6/8/2169 9:42 2204-01-01 NA Outpati~ Med / DM 2169-06-08 2169-06-08
# 4 7/24/2169 8:~ 2204-01-01 NA Outpati~ Med / Renal 2169-07-24 2169-07-24
# 5 9/12/2169 10~ 2204-01-01 NA Outpati~ Med / DM 2169-09-12 2169-09-12
# 6 6/19/2237 12~ 2204-01-01 NA Outpati~ Dental / Den~ 2237-06-19 2237-06-19
# 7 6/19/2237 12~ 2204-01-01 NA Outpati~ Dental / Den~ 2237-06-19 2237-06-19
# 8 6/19/2237 12~ 2204-01-01 NA Outpati~ Dental / Den~ 2237-06-19 2237-06-19
# 9 4/27/2238 14~ 2204-01-01 NA Outpati~ OVS / Eye 2238-04-27 2238-04-27
# 10 4/27/2238 14~ 2204-01-01 NA Outpati~ OVS / Eye 2238-04-27 2238-04-27
I can't test this because you didn't provide sample input. If it doesn't work, please edit your question to include some sample data shared with dput
, e.g., dput(droplevels(DF[1:10, ]))
. Then I can test on that data and try to figure out what goes wrong.
Upvotes: 1