Rez99

Reputation: 389

Unexpected behavior with dplyr and dates

I have a dataframe with some dates:

library(dplyr)

dates.df <- seq(from = as.Date("2019-01-01"), to = as.Date("2019-01-07"), by = "day") %>% 
  data.frame(date = .) 
dates.df

# date
# 2019-01-01
# 2019-01-02
# 2019-01-03
# 2019-01-04
# 2019-01-05
# 2019-01-06
# 2019-01-07

I would like to create a second column that mirrors the date in the first column, unless the date is before 2019-01-04, in which case it should show 2019-01-04, like so:

# date          date.prime
# 2019-01-01    2019-01-04
# 2019-01-02    2019-01-04
# 2019-01-03    2019-01-04
# 2019-01-04    2019-01-04
# 2019-01-05    2019-01-05
# 2019-01-06    2019-01-06
# 2019-01-07    2019-01-07

I have tried:

dates.df %>% 
mutate(date.prime=ifelse(date < "2019-01-04", "2019-01-04", date))

But this yields:

# date          date.prime
# 2019-01-01    2019-01-04
# 2019-01-02    2019-01-04
# 2019-01-03    2019-01-04
# 2019-01-04    17900
# 2019-01-05    17901
# 2019-01-06    17902
# 2019-01-07    17903

Any suggestions?

Upvotes: 0

Views: 108

Answers (1)

Ronak Shah

Reputation: 389047

First, your attempt compares date with a character string ("2019-01-04") rather than an actual Date object, which might give you unexpected results.

class("2019-01-04")
#[1] "character"

For the comparison to work reliably, you need to convert it to a Date:

class(as.Date("2019-01-04"))
#[1] "Date"

If we change your attempt accordingly, we get:

library(dplyr)

dates.df %>% 
   mutate(date.prime = ifelse(date < as.Date("2019-01-04"), 
                              as.Date("2019-01-04"), date))


#        date date.prime
#1 2019-01-01      17900
#2 2019-01-02      17900
#3 2019-01-03      17900
#4 2019-01-04      17900
#5 2019-01-05      17901
#6 2019-01-06      17902
#7 2019-01-07      17903

That is because ifelse strips attributes, so the dates lose their Date class and are shown as their underlying numeric values (days since 1970-01-01).
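A minimal base-R sketch of this class loss, outside of dplyr. The names `d`, `cutoff` and `res` are illustrative and do not come from the question:

```r
# Base R only; `d`, `cutoff` and `res` are hypothetical names for illustration.
d      <- as.Date(c("2019-01-03", "2019-01-05"))
cutoff <- as.Date("2019-01-04")

res <- ifelse(d < cutoff, cutoff, d)

class(d)    # "Date"
class(res)  # "numeric": the Date class attribute has been dropped
res         # 17900 17901, i.e. days since 1970-01-01
```

The values themselves are correct; only the class attribute that tells R to print them as dates is missing.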

To overcome that, we can convert it to Date again.

dates.df %>% 
   mutate(date.prime = as.Date(ifelse(date < as.Date("2019-01-04"),
                       as.Date("2019-01-04"), date),
                       # older R versions require an origin for numeric input
                       origin = "1970-01-01"))

#       date date.prime
#1 2019-01-01 2019-01-04
#2 2019-01-02 2019-01-04
#3 2019-01-03 2019-01-04
#4 2019-01-04 2019-01-04
#5 2019-01-05 2019-01-05
#6 2019-01-06 2019-01-06
#7 2019-01-07 2019-01-07

Or, following the suggestions in the comments, use dplyr's if_else, as mentioned by @Tung, which preserves the Date class:

dates.df %>% 
  mutate(date.prime = if_else(date < as.Date("2019-01-04"), 
                      as.Date("2019-01-04"), date))

Or use pmax, as suggested by @r2evans:

dates.df %>% 
   mutate(date.prime = pmax(as.Date("2019-01-04"), date))
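As a quick sanity check, here is a base-R sketch (no dplyr needed) comparing the pmax approach with the converted ifelse approach. The names `dates`, `cutoff`, `via_pmax` and `via_ifelse` are illustrative only:

```r
# Base R only; all variable names here are hypothetical, chosen for illustration.
dates  <- seq(as.Date("2019-01-01"), as.Date("2019-01-07"), by = "day")
cutoff <- as.Date("2019-01-04")

# pmax() raises every date to at least `cutoff` and keeps the Date class
via_pmax <- pmax(cutoff, dates)

# ifelse() returns plain numbers, so convert back with an explicit origin
via_ifelse <- as.Date(ifelse(dates < cutoff, cutoff, dates),
                      origin = "1970-01-01")

class(via_pmax)              # "Date"
all(via_pmax == via_ifelse)  # TRUE
```

pmax is probably the shortest option here, since the task is really "clamp each date to a minimum" rather than a general conditional.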

Upvotes: 2
