tfkLSTM

Reputation: 181

Standard and non-standard evaluation in dplyr

Dear colleagues, I am trying to build a function that linearly interpolates data in a data frame.

The code looks as follows:

Linear_Interpolation <- function(df, min_ts, max_ts, target_column, signal_key) {
  if (exists(deparse(substitute(df))) == TRUE) {
    if (nrow(df) != 0) {
      vector.sequences <- seq(from = min_ts,
                              to = max_ts,
                              by = "hour")

      df.interpolation.aux <- data.table(snsr_ts = vector.sequences)
      df.interpolated <- bind_rows(df, df.interpolation.aux) %>% 
        arrange(., snsr_ts)

      df.duplicates <- which(duplicated((df.interpolated$snsr_ts)))
      df.interpolated <- df.interpolated[-df.duplicates,] %>%
        mutate_(., column = na.approx(column)) %>%
        mutate(., snsr_dt = as.Date(snsr_ts)) %>%
        mutate(., package = aux1$package) %>%
        rename_at(snsr_val = column) %>% 
        mutate(snsr_key = signal_key) %>%
        mutate(locf_tag='N') %>% 
        mutate(qlty_good_ind=ifelse(is.na(qlty_good_ind)==TRUE, 'Y', qlty_good_ind)) %>%
        mutate(qlty_interp=ifelse(is.na(qlty_interp)==TRUE, -3, qlty_interp))
    }
  } else {
    df.interpolated <- NULL
  }
  return(df.interpolated)
}

Since I am using dplyr, I am aware that I cannot rely on standard evaluation for the column argument. When I tried mutate_, I got a message that it is now deprecated. I therefore followed the https://dplyr.tidyverse.org/articles/programming.html guide and wrote the following version:

Linear_Interpolation <- function(df, min_ts, max_ts, target, signal_key) {
  if (exists(deparse(substitute(df))) == TRUE) {
    if (nrow(df) != 0) {
      target <- enquo(target)
      signal_key <- enquo(signal_key)
      vector.sequences <- seq(from = min_ts,
                              to = max_ts,
                              by = "hour")

      df.interpolation.aux <- data.table(snsr_ts = vector.sequences)
      df.interpolated <- bind_rows(df, df.interpolation.aux) %>% 
        arrange(., snsr_ts)

      df.duplicates <- which(duplicated((df.interpolated$snsr_ts)))
      df.interpolated <- df.interpolated[-df.duplicates,] %>%
        mutate(snsr_val = na.approx(!!target)) %>%
        mutate(snsr_dt = as.Date(snsr_ts)) %>%
        mutate(., package = aux1$package) %>%
        mutate(snsr_key = !!signal_key) %>%
        mutate(locf_tag='N') %>% 
        mutate(qlty_good_ind=ifelse(is.na(qlty_good_ind)==TRUE, 'Y', qlty_good_ind)) %>%
        mutate(qlty_interp=ifelse(is.na(qlty_interp)==TRUE, -3, qlty_interp))
    }
  } else {
    df.interpolated <- NULL
  }
  return(df.interpolated)
}

However, I get the following error:

   df.interpolated.final <- Linear_Interpolation(df, min(df$snsr_ts), max(df$snsr_ts), "column_name", "71")
    Error in xy.coords(x, y, setLab = FALSE) : 
      'pairlist' object cannot be coerced to type 'double'
    In addition: Warning message:
    In is.na(y) :
     Error in xy.coords(x, y, setLab = FALSE) : 
      'pairlist' object cannot be coerced to type 'double' 
    > 

I suspect that target is being read as text inside the na.approx function, although I have not been able to debug it fully. The input data frame is as follows:

snsr_dt     package  value  snsr_ts         locf_tag  db_src  qlty_interp  qlty_good_ind
8/26/2011   589      0      8/26/11 12:00   N         2       1            Y
10/4/2013   589      147    10/4/13 0:00    N         2       1            Y
10/17/2014  589      160    10/17/14 0:00   N         2       1            Y
11/14/2015  589      168    11/14/15 0:00   N         2       1            Y
12/28/2016  589      198    12/28/16 0:00   N         2       1            Y
1/10/2018   589      215    1/10/18 0:00    N         2       1            Y
1/4/2019    589      238    1/4/19 0:00     N         2       1            Y

Does anyone know what is going on?

Upvotes: 0

Views: 218

Answers (1)

Artem Sokolov

Reputation: 13691

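Because you want to accept both strings and symbols, the right function is ensym(), not enquo(). enquo() captures a string argument as a string constant, so !!target unquotes to the literal text rather than to the column, whereas ensym() converts a string to a symbol. Here's a minimal reproducible example that you can adapt to your larger application: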

library( tidyverse )
library( lubridate )

Linear_Interpolation <- function( df, target ) {
    vseq <- seq( from=min(df$date), to=max(df$date), by="day")
    tibble(date = vseq) %>% 
        left_join(df, by="date") %>%
        mutate( snsr_val = zoo::na.approx(!!ensym(target)) )    # <--- ensym
}

## Data
X <- tibble( date = mdy("10/4/2013", "10/7/2013", "10/12/2013"),
             value = c(0, 147, 160) )

Linear_Interpolation( X, "value" )   # Works on strings
Linear_Interpolation( X, value )     #   ...and symbols
#   date       value snsr_val
#   <date>     <dbl>    <dbl>
# 1 2013-10-04     0       0 
# 2 2013-10-05    NA      49 
# 3 2013-10-06    NA      98 
# 4 2013-10-07   147     147 
# 5 2013-10-08    NA     150.
# 6 2013-10-09    NA     152.
# 7 2013-10-10    NA     155.
# 8 2013-10-11    NA     157.
# 9 2013-10-12   160     160 
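
To see why the choice of function matters, it can help to inspect what each one captures when it is handed a string. The sketch below is only an illustration: capture_quo() and capture_sym() are hypothetical helper names that are not part of the code above, and it assumes the rlang package is installed.

```r
library(rlang)

# Hypothetical helpers, for illustration only: each captures its argument
# with a different rlang function.
capture_quo <- function(x) enquo(x)
capture_sym <- function(x) ensym(x)

quo_from_string <- capture_quo("value")  # quosure wrapping the string "value"
sym_from_string <- capture_sym("value")  # string converted to the symbol `value`
sym_from_name   <- capture_sym(value)    # bare name captured as the symbol `value`

# enquo() keeps the string as a string, so !! would splice in literal text
is.character(quo_get_expr(quo_from_string))  # TRUE

# ensym() yields the same symbol for both calling styles
is.symbol(sym_from_string)                   # TRUE
identical(sym_from_string, sym_from_name)    # TRUE
```

Since both calling styles end up as the same symbol, !!ensym(target) refers to the column either way, which is what the function above relies on.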

Upvotes: 0
