phaser

Reputation: 625

dplyr: function error message

I wrote a function that subtracts values in my data based on an ID. It worked fine until dplyr was updated. At first, the function would not accept the column name as an argument. I followed the Programming with dplyr vignette to make it accept the column name, but now I get a new error message.

testdf <- structure(list(
  date = c("2016-04-04", "2016-04-04", "2016-04-04",
           "2016-04-04", "2016-04-04", "2016-04-04"),
  sensorheight = c(1L, 16L, 1L, 16L, 1L, 16L),
  farm = c("McDonald", "McDonald", "McDonald",
           "McDonald", "McDonald", "McDonald"),
  location = c("4", "4", "5", "5", "Outside", "Outside"),
  Temp = c(122.8875, 117.225, 102.0375, 98.3625, 88.5125, 94.7)),
  .Names = c("date", "sensorheight", "farm", "location", "Temp"),
  row.names = c(NA, 6L), class = "data.frame")


DailyInOutDiff <- function (df, variable) {

  DailyInOutDiff04 <- df %>%
    filter(location %in% c(4, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else !!variable[location=="4"] - !!variable[location=='Outside'], 
              location = "4")  %>%
    select(1, 2, 3, 5, 4)

  DailyInOutDiff05 <- df %>%
    filter(location %in% c(5, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else !!variable[location=="5"] - !!variable[location=='Outside'], 
              location = "5")  %>%
    select(1, 2, 3, 5, 4)

  temp.list <- list(DailyInOutDiff04, DailyInOutDiff05)
  final.df = bind_rows(temp.list)
  return(final.df)
}

test <- DailyInOutDiff(testdf, quo(Temp))

I would like to know what the error message means and how to fix it.

 Error in location == "4" : 
    comparison (1) is possible only for atomic and list types 

Upvotes: 2

Views: 195

Answers (1)

aosmith

Reputation: 36076

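I think the precedence of ! is causing problems. Because [ binds more tightly than !, the expression !!variable[location == "4"] is parsed as !!(variable[location == "4"]), so the subsetting is attempted before the unquoting. That would also explain the error message: outside the data, location most likely resolves to dplyr's location() function, and a function cannot be compared with "4". When that happens it looks like UQ should be used in place of !!.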

In that case, the first part of your function would look like

DailyInOutDiff <- function (df, variable) {

    variable = enquo(variable)

    df %>%
        filter(location %in% c(4, 'Outside')) %>% 
        group_by(date, sensorheight, farm) %>%
        arrange(sensorheight, farm, location) %>%
        summarise(Diff = if(n()==1) NA else UQ(variable)[location == "4"] - 
                    UQ(variable)[location == "Outside"], 
                location = "4")

}

This now runs without error.

DailyInOutDiff(testdf, Temp)

        date sensorheight     farm   Diff location
       <chr>        <int>    <chr>  <dbl>    <chr>
1 2016-04-04            1 McDonald 34.375        4
2 2016-04-04           16 McDonald 22.525        4
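The precedence issue can be checked without dplyr. The following is a minimal sketch in base R that only inspects how the parser groups the expression; the names outer_fn, inner_fn and paren_fn are made up for this illustration, and the parenthesised form is an extra comparison that is not part of the original code.

```r
# Base R only: inspect how the parser groups the expression.
e <- quote(!!variable[location == "4"])

# The outermost call is `!`, so the subsetting sits inside both negations.
outer_fn <- as.character(e[[1]])

# Step through the two `!` calls to reach the subsetting call.
inner    <- e[[2]][[2]]
inner_fn <- as.character(inner[[1]])

# With explicit parentheses the outermost call is `[` instead.
e_paren  <- quote((!!variable)[location == "4"])
paren_fn <- as.character(e_paren[[1]])

print(c(outer_fn, inner_fn, paren_fn))
```

In the first expression the `[` call is the operand of `!!`, which is why the subsetting happens before the column is unquoted.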

I think using UQ is probably the best way to do this. An alternative is to call the extraction operator `[` in its function form, which also bypasses the precedence problem.

For example, code that looks like

!!variable[location == "4"]

can be rewritten as

`[`(!!variable, location == "4")

With these changes, the first part of your function would look like

DailyInOutDiff <- function (df, variable) {

    variable = enquo(variable)

    df %>%
        filter(location %in% c(4, 'Outside')) %>% 
        group_by(date, sensorheight, farm) %>%
        arrange(sensorheight, farm, location) %>%
        summarise(Diff = if(n()==1) NA else `[`(!!variable, location == "4") - 
                    `[`(!!variable, location == "Outside"), 
                location = "4")

}

This also runs without error:

DailyInOutDiff(testdf, Temp)

        date sensorheight     farm   Diff location
       <chr>        <int>    <chr>  <dbl>    <chr>
1 2016-04-04            1 McDonald 34.375        4
2 2016-04-04           16 McDonald 22.525        4
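A third option, not covered above, is to wrap the unquoted variable in parentheses so that the unquoting happens before the subsetting. This may be worth knowing because later rlang releases deprecate UQ(). The sketch below assumes dplyr is installed; the function name daily_in_out_diff_paren is hypothetical, NA_real_ replaces NA only to keep the column type numeric, and the test data is the question's testdf rebuilt with data.frame().

```r
library(dplyr)

# The question's example data, rebuilt with data.frame().
testdf <- data.frame(
  date = rep("2016-04-04", 6),
  sensorheight = c(1L, 16L, 1L, 16L, 1L, 16L),
  farm = rep("McDonald", 6),
  location = c("4", "4", "5", "5", "Outside", "Outside"),
  Temp = c(122.8875, 117.225, 102.0375, 98.3625, 88.5125, 94.7),
  stringsAsFactors = FALSE
)

# Hypothetical variant of the first half of DailyInOutDiff().
daily_in_out_diff_paren <- function(df, variable) {
  variable <- enquo(variable)

  df %>%
    filter(location %in% c("4", "Outside")) %>%
    group_by(date, sensorheight, farm) %>%
    # (!!variable) is unquoted first, then subset by location.
    summarise(
      Diff = if (n() == 1) NA_real_ else
        (!!variable)[location == "4"] - (!!variable)[location == "Outside"],
      location = "4"
    )
}

res <- daily_in_out_diff_paren(testdf, Temp)
print(res)
```

On the example data this should give the same two Diff values as the outputs shown above.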

Upvotes: 2
