Mansoor

Reputation: 1284

Using Variable in lag function of dplyr - R

I have written a function that takes a data frame, a column name, and a lag interval, and should compute the lag of that column. I cannot get it to work. Here is my code:

calculateLag <- function(df,lagCol,lagInterval){

  df <- df %>%
   group_by(grp = cumsum(c(TRUE, diff(t)!=1))) %>%

   mutate(val_lag = lag(df[,lagCol],lagInterval)) %>%
   ungroup() %>%
   select(-grp)

   return(df)
}

I get the following error:

 Error in `[.data.table`(df, , lagCol) : 
 j (the 2nd argument inside [...]) is a single symbol but column name 'lagCol' is not found. Perhaps you intended DT[,..lagCol] or DT[,lagCol,with=FALSE]. This difference to data.frame is deliberate and explained in FAQ 1.1.    

Expected Result:

                   t         val   val_lag   val_lag2
 2005-01-17 17:30:00       14.3        NA         NA
 2005-01-17 18:30:00       14.0      14.3         NA
 2005-01-17 19:30:00       14.3      14.0       14.3
 2005-01-17 22:30:00       14.9        NA         NA
 2005-01-17 23:30:00       14.2      14.9         NA
 2005-01-18 00:30:00       14.1      14.2       14.9

Can someone help me with this?

Thanks

Upvotes: 0

Views: 469

Answers (1)

CPak

Reputation: 13591

A reproducible example would be helpful.

Look at this example using mtcars:

library(dplyr)
calculateLag <- function(df,lagCol,lagInterval){
  lagCol <- enquo(lagCol)    # need to quote
  df <- df %>%
         group_by(cyl) %>%
         mutate(val_lag = lag(!!lagCol, lagInterval)) %>%   # !! unquotes
         ungroup()
  return(df)
}

calculateLag(select(mtcars,cyl,gear), gear, 2)

See this link about non-standard evaluation.
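The code in the question indexes with `df[,lagCol]`, which suggests the column name may be arriving as a character string rather than a bare name. As a minimal sketch for that case (the function name `calculateLagByName` is made up for illustration), dplyr's `.data` pronoun can look the column up by string, so no `enquo()` is needed:

```r
library(dplyr)

# Hypothetical variant: lagCol is a character string such as "gear",
# not a bare column name.
calculateLagByName <- function(df, lagCol, lagInterval) {
  df %>%
    group_by(cyl) %>%
    # .data[[lagCol]] fetches the column whose name is stored in lagCol
    mutate(val_lag = lag(.data[[lagCol]], lagInterval)) %>%
    ungroup()
}

res <- calculateLagByName(select(mtcars, cyl, gear), "gear", 2)
```

This should give the same `val_lag` column as the `enquo()` version above; the only difference is that the column is passed in quotes.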

With your data

calculateLag <- function(df,lagCol,lagInterval){
    lagCol <- enquo(lagCol)
    df <- df %>%
            group_by(grp = cumsum(c(TRUE, diff(t)!=1))) %>%
            mutate(val_lag = lag(!!lagCol, lagInterval)) %>%
            ungroup() %>%
            select(-grp)
    return(df)
}

calculateLag(df, val, 2)

Output with your data

                    t   val val_lag
1 2005-01-17 06:00:00  10.8      NA
2 2005-01-17 07:00:00  10.8      NA
3 2005-01-17 08:00:00  10.7    10.8
4 2005-01-17 09:00:00  10.6    10.8
5 2005-01-17 10:00:00  10.6    10.7
6 2005-01-17 11:00:00  10.7    10.6

Upvotes: 1
