larnsce

Reputation: 113

How to use lubridate as_datetime function in combination with dplyr mutate and case_when functions?

I am trying to manipulate a dttm variable to adjust for different time zones depending on a numeric id vector. Using a character vector as the new value, I can set the variable per id without issues. However, when I use the as_datetime() function to create the new value, every row receives the result of the first branch in case_when.

The id vector is numeric. I have also tried converting it to a factor and to character, but the issue persists.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

df1 <- tibble(
  id = c(1, 2, 3),
  date_time = rep(as_datetime("2018-01-01 12:34:56", tz = "Europe/Zurich"), 3)
) %>% 
  mutate(
    date_time2 = case_when(
      id == 1 ~ "one",
      id == 2 ~ "two",
      TRUE ~ "three"
    )
  )


df2 <- tibble(
  id = c(1, 2, 3),
  date_time = rep(as_datetime("2018-01-01 12:34:56", tz = "Europe/Zurich"), 3)
) %>% 
  mutate(
    date_time2 = case_when(
      id == 1 ~ as_datetime(date_time, tz = "America/New_York"),
      id == 2 ~ as_datetime(date_time, tz = "Asia/Kolkata"),
      TRUE ~ date_time
    )
  )

df3 <- tibble(
  id = c(1, 2, 3),
  date_time = rep(as_datetime("2018-01-01 12:34:56", tz = "Europe/Zurich"), 3)
) %>% 
  mutate(
    date_time2 = case_when(
      id == 1 ~ as_datetime(date_time, tz = "Asia/Kolkata"),
      id == 2 ~ as_datetime(date_time, tz = "America/New_York"),
      TRUE ~ date_time
    )
  )


df1 
#> # A tibble: 3 x 3
#>      id date_time           date_time2
#>   <dbl> <dttm>              <chr>     
#> 1     1 2018-01-01 12:34:56 one       
#> 2     2 2018-01-01 12:34:56 two       
#> 3     3 2018-01-01 12:34:56 three

df2
#> # A tibble: 3 x 3
#>      id date_time           date_time2         
#>   <dbl> <dttm>              <dttm>             
#> 1     1 2018-01-01 12:34:56 2018-01-01 06:34:56
#> 2     2 2018-01-01 12:34:56 2018-01-01 06:34:56
#> 3     3 2018-01-01 12:34:56 2018-01-01 06:34:56

df3
#> # A tibble: 3 x 3
#>      id date_time           date_time2         
#>   <dbl> <dttm>              <dttm>             
#> 1     1 2018-01-01 12:34:56 2018-01-01 17:04:56
#> 2     2 2018-01-01 12:34:56 2018-01-01 17:04:56
#> 3     3 2018-01-01 12:34:56 2018-01-01 17:04:56

Created on 2019-03-26 by the reprex package (v0.2.1)

df1 shows what I expect.

In df2, I expect date_time2 at id == 2 to show "2018-01-01 17:04:56" and not "2018-01-01 06:34:56".

In df3, I expect date_time2 at id == 3 to show "2018-01-01 12:34:56" and not "2018-01-01 17:04:56".

Upvotes: 4

Views: 1172

Answers (2)

Cettt

Reputation: 11981

This seems to be a bug (probably in dplyr, since there have been issues with dates before).

Here is a possible workaround (don't ask me why it works :))

tibble(
  id = c(1, 2, 3),
  date_time = rep(as_datetime("2018-01-01 12:34:56", tz = "Europe/Zurich"), 3)
) %>% 
  mutate(
    date_time2 = case_when(
      id == 1 ~ as_datetime(as.character(as_datetime(date_time, tz = "America/New_York"))),
      id == 2 ~ as_datetime(as.character(as_datetime(date_time, tz = "Asia/Kolkata"))),
      TRUE ~ as_datetime(as.character(date_time))
    )
  )

# A tibble: 3 x 3
     id date_time           date_time2         
  <dbl> <dttm>              <dttm>             
1     1 2018-01-01 12:34:56 2018-01-01 06:34:56
2     2 2018-01-01 12:34:56 2018-01-01 17:04:56
3     3 2018-01-01 12:34:56 2018-01-01 12:34:56
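
A possible explanation, offered as a sketch rather than a confirmed diagnosis: a date-time vector stores instants plus one time zone attribute for the whole vector, so converting a date-time to another zone changes how it prints, not the instant itself. If that is what happens here, all three case_when branches hold the same instant and the combined column can only display it in one zone. The round trip through character would then work because the clock time is re-parsed as a new instant (UTC by default for character input). The snippet below uses base R equivalents (as.POSIXct, format, and the tzone attribute) instead of the lubridate calls, which is an assumption made to keep it dependency-free.

```r
fmt <- "%Y-%m-%d %H:%M:%S"

# One instant, created in the Zurich time zone as in the question
zurich <- as.POSIXct("2018-01-01 12:34:56", tz = "Europe/Zurich")

# Viewing it in New York only changes the tzone attribute
new_york <- zurich
attr(new_york, "tzone") <- "America/New_York"

# The underlying value (seconds since the epoch) is unchanged
same_instant <- as.numeric(new_york) == as.numeric(zurich)

# Round trip through text, re-parsed as UTC: the New York clock time
# is kept, so the result is a different instant
reparsed <- as.POSIXct(format(new_york, format = fmt), tz = "UTC")
shifted_hours <- as.numeric(difftime(reparsed, zurich, units = "hours"))

print(same_instant)
print(shifted_hours)
```

Note that under this reading the workaround's date_time2 column no longer holds the original instants; it holds the local clock times labelled as UTC.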

Upvotes: 1

www

Reputation: 39154

We can use force_tzs from the lubridate package and pass a different time zone for each element to the tzones argument. In this case, case_when is not required, provided you know the order of the time zones.

library(dplyr)
library(lubridate)

df2 %>%
  mutate(date_time2 = force_tzs(date_time, tzones = c("America/New_York", "Asia/Kolkata", "UTC")))
# # A tibble: 3 x 3
#      id date_time           date_time2         
#   <dbl> <dttm>              <dttm>             
# 1     1 2018-01-01 12:34:56 2018-01-01 17:34:56
# 2     2 2018-01-01 12:34:56 2018-01-01 07:04:56
# 3     3 2018-01-01 12:34:56 2018-01-01 12:34:56

df3 %>%
  mutate(date_time2 = force_tzs(date_time, tzones = c("Asia/Kolkata", "America/New_York", "UTC")))
# # A tibble: 3 x 3
#      id date_time           date_time2         
#   <dbl> <dttm>              <dttm>             
# 1     1 2018-01-01 12:34:56 2018-01-01 07:04:56
# 2     2 2018-01-01 12:34:56 2018-01-01 17:34:56
# 3     3 2018-01-01 12:34:56 2018-01-01 12:34:56
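
As far as I can tell, force_tzs keeps the clock time 12:34:56 and reinterprets it in each zone, which is why the output above shows 17:34:56 and 07:04:56 rather than the 06:34:56 and 17:04:56 the question expects. If the goal is the local clock time per id, one option is a character column, since a character vector can hold a differently formatted value per row. The following is a minimal sketch under that assumption; the column name local_clock and the format string fmt are made up for illustration, and base R's as.POSIXct and format stand in for the lubridate calls.

```r
library(dplyr)

fmt <- "%Y-%m-%d %H:%M:%S"

df <- tibble(
  id = c(1, 2, 3),
  date_time = rep(as.POSIXct("2018-01-01 12:34:56", tz = "Europe/Zurich"), 3)
) %>%
  mutate(
    # format() returns character, so each branch keeps its own clock time
    local_clock = case_when(
      id == 1 ~ format(date_time, format = fmt, tz = "America/New_York"),
      id == 2 ~ format(date_time, format = fmt, tz = "Asia/Kolkata"),
      TRUE ~ format(date_time, format = fmt, tz = "Europe/Zurich")
    )
  )

print(df$local_clock)
```

The trade-off is that local_clock is text, so date arithmetic should still be done on the original date_time column.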

Upvotes: 1
