Julia Maddalena
Julia Maddalena

Reputation: 173

Calculating lead variable within mutate in R with dates

I am using dplyr's mutate function to create a POSIX date column of a data frame by taking the lead of another column. When I try to fill in the missing values in the lead function using a single date, I get an error:

> dates
# A tibble: 5 x 1
  orig_date          
  <dttm>             
1 2016-06-21 20:00:00
2 2016-07-09 22:00:00
3 2016-07-10 22:00:00
4 2016-07-20 21:00:00
5 2016-07-21 21:00:00

> fillin_date
[1] "2018-08-29 UTC"

> dates %>% mutate(next_date = lead(orig_date, 1, default = fillin_date))

Error in mutate_impl(.data, dots) : 
  Not compatible with requested type: [type=symbol; target=double].

This does not happen outside of mutate:

> lead(dates$orig_date, 1, default = fillin_date)
[1] "2016-07-09 22:00:00 UTC" "2016-07-10 22:00:00 UTC" "2016-07-20 21:00:00 UTC"
[4] "2016-07-21 21:00:00 UTC" "2018-08-29 00:00:00 UTC"

What is going wrong here?

Upvotes: 4

Views: 789

Answers (1)

Calum You
Calum You

Reputation: 15072

I am not sure as to the underlying reason why you can supply the symbol outside of mutate but not inside, but you can get around it by quoting and unquoting the variable. You can also save your date to fill in as character and just convert to date inside the mutate call.

library(tidyverse)
df <- tibble(orig_date = c("2016-06-21 20:00:00", "2016-07-09 22:00:00", "2016-07-10 22:00:00", "2016-07-20 21:00:00", "2016-07-21 21:00:00")) %>%
  mutate(orig_date = as.POSIXct(orig_date))

fillin_date <- as.POSIXct("2018-08-29")
fillin_date2 <- "2018-08-29"
df %>%
  mutate(next_date = lead(orig_date, 1, default = !!quo(fillin_date)))
#> # A tibble: 5 x 2
#>   orig_date           next_date          
#>   <dttm>              <dttm>             
#> 1 2016-06-21 20:00:00 2016-07-09 22:00:00
#> 2 2016-07-09 22:00:00 2016-07-10 22:00:00
#> 3 2016-07-10 22:00:00 2016-07-20 21:00:00
#> 4 2016-07-20 21:00:00 2016-07-21 21:00:00
#> 5 2016-07-21 21:00:00 2018-08-29 00:00:00
df %>%
  mutate(next_date = lead(orig_date, 1, default = as.POSIXct(fillin_date2)))
#> # A tibble: 5 x 2
#>   orig_date           next_date          
#>   <dttm>              <dttm>             
#> 1 2016-06-21 20:00:00 2016-07-09 22:00:00
#> 2 2016-07-09 22:00:00 2016-07-10 22:00:00
#> 3 2016-07-10 22:00:00 2016-07-20 21:00:00
#> 4 2016-07-20 21:00:00 2016-07-21 21:00:00
#> 5 2016-07-21 21:00:00 2018-08-29 00:00:00

Created on 2018-10-03 by the reprex package (v0.2.0).

Upvotes: 1

Related Questions