Marc Asensio Manjon

Reputation: 121

Creating factor variable from date variable in R

First time asking something here!

I have a dataset that recorded the starting date on which respondents took the survey. The variables are these two:

Dataframe <- structure(list(
    StartDate = c(13779297307, 13780588700, 13778275992, 13779183693,
                  13777765681, 13777812963, 13777739454, 13777734136,
                  13777851398, 13777848412, 13777844822, 13777752646,
                  13778084770, 13779035907, 13777740733),
    EndDate = c(13779299879, 13780589841, 13778281191, 13779184505,
                13777766537, 13777816641, 13777741146, 13777737284,
                13777854525, 13777850410, 13777847624, 13777754313,
                13778086732, 13779036482, 13777742186)),
    row.names = c(NA, 15L), class = "data.frame")

As the original file was a .sav file, I applied the necessary transformations to turn the variable into a date.

Dataframe$StartDate <- as.POSIXct(Dataframe$StartDate, origin = "1582-10-14", tz="GMT")
Dataframe$StartDate <- format(as.POSIXct(Dataframe$StartDate,format='%m/%d/%Y %H:%M:%S'),format='%m/%d/%Y')

What I want to do is transform the variable (or create a new one) so that it classifies respondents by when they took the survey: before the 3rd of June is the "No reminder" group, from the 3rd of June up to (but not including) the 14th of June is "1st reminder", and from the 14th of June onwards is "2nd reminder". However, the variable I created does not seem to follow any time logic and assigns all observations to "2nd reminder". I tried to apply numeric transformations, following some recommendations on this website, but that did not work.

Dataframe <- Dataframe %>% 
    mutate(StartDate = case_when(
        StartDate < 06/03/2019 ~ "No reminder",
        StartDate > 06/02/2019 & StartDate < 06/14/2019 ~ "1st reminder",
        StartDate > 06/13/2019 ~ "2nd reminder",
    ))

So I guess the problem is related to the date variable itself. How can I get a date variable that follows time logic?

Thanks in advance for the help & time.

Upvotes: 1

Views: 483

Answers (1)

Andrew Chisholm

Reputation: 6567

Change the dates to strings by putting them in quotes:

Dataframe2 <- Dataframe %>% 
    mutate(StartDate = case_when(
        StartDate < "06/03/2019" ~ "No reminder",
        StartDate > "06/02/2019" & StartDate < "06/14/2019" ~ "1st reminder",
        StartDate > "06/13/2019" ~ "2nd reminder",
    ))

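With quotes, both sides of each comparison are character strings, and the comparison works for this data. Note that strings in month/day/year format are compared character by character, so this is only reliable while all dates fall in the same year. Without quotes, R evaluates the expression as arithmetic (6 divided by 3 divided by 2019), so StartDate is compared with a tiny number rather than with a date: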

06/03/2019
[1] 0.0009905894
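If the comparison should follow real time logic rather than string order, one option is to keep StartDate as a Date instead of formatting it to text. The following is only a sketch under some assumptions: it uses five of the raw values from the question, the cut-off dates are the ones described there, and the names ReminderGroup, first_reminder and second_reminder are made up for illustration.

```r
library(dplyr)

# A few of the raw SPSS timestamps from the question
# (seconds since 1582-10-14).
Dataframe <- data.frame(
  StartDate = c(13779297307, 13780588700, 13778275992,
                13779035907, 13777740733)
)

# Convert to a Date and keep it as a Date. Calling format() afterwards
# would turn the column into character strings again.
Dataframe$StartDate <- as.Date(
  as.POSIXct(Dataframe$StartDate, origin = "1582-10-14", tz = "GMT"),
  tz = "GMT"
)

# Cut-off dates from the question (hypothetical variable names).
first_reminder  <- as.Date("2019-06-03")
second_reminder <- as.Date("2019-06-14")

# Compare Date with Date; case_when() uses the first condition that is TRUE.
# The result is stored in a new column so the original dates are preserved.
Dataframe <- Dataframe %>%
  mutate(ReminderGroup = case_when(
    StartDate <  first_reminder  ~ "No reminder",
    StartDate <  second_reminder ~ "1st reminder",
    StartDate >= second_reminder ~ "2nd reminder"
  )) %>%
  # Turn the labels into a factor with a fixed level order.
  mutate(ReminderGroup = factor(
    ReminderGroup,
    levels = c("No reminder", "1st reminder", "2nd reminder")
  ))

print(Dataframe)
```

Because both sides of each comparison are Date objects, this should also behave correctly if the data ever spans more than one year.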

Upvotes: 1
