Stephen Poole

Reputation: 391

What is causing 'object not found' error in filter() with the across() function?

This function filters/selects one or more variables from my dataset and writes it to a new CSV file. I'm getting an 'object not found' error when I call the function. Here is the function:

extract_ids <- function(filename, opp, ...) {

  #Read in data
  df <- read_csv(filename)

  #Remove rows 2,3
  df <- df[-c(1,2),]

  #Filter and select
  df_id <- filter(df, across(..., ~ !is.na(.x)) & gc == 1) %>%
    select(...) #not sure if my use of ... here is correct

  #String together variables for export file path
  path <- c("/Users/stephenpoole/Downloads/",opp,"_",...,".csv") #not sure if ... here is correct

  #Export the file
  write_csv(df_id, paste(path,collapse=''))

}

And here is the function call. I'm trying to get columns "rid" and "cintid."

extract_ids(filename = "farmers.csv",
            opp = "farmers",
            rid, cintid)

When I run this, I get the below error:

 Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `across(..., ~!is.na(.x)) & gc == 1`.
x object 'cintid' not found

The column cintid is correct and appears in the data. I've also tried running it with just one column, rid, and get the same 'object not found' error.

Upvotes: 3

Views: 3071

Answers (2)

Greg

Reputation: 3326

Sorry for omitting this in my previous suggestion to you. Unfortunately, your original question was closed before I could post it as an answer:

If you want your function to resemble dplyr, here are a few modifications you can make. Write your function header as function(filename, opp, ...) verbatim. Then replace !is.na(ID) with across(..., ~ !is.na(.x)) verbatim. Now you can call extract_ids() and, just as you would with any dplyr verb, specify any selection of columns from which to filter out NAs: extract_ids(filename = "farmers.csv", opp = "farmers", rid, another_column_you_want_without_NAs).

Object Not Found

As MrFlick rightly suggests in their comment, you should wrap ... with c(), so that everything you pass in ... is interpreted as the first argument to across(), that is, as a single tidy-selection of columns from df:

extract_ids <-  function(filename, opp, ...) {
  # ...

  # Filter and select
  df_id <- df %>%
    # This format is preferred for dplyr workflows with pipes (%>%).
    filter(across(c(...), ~ !is.na(.x)) & gc == 1) %>%
    select(...)

  # ...
}

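Without this precaution, R spreads rid and cintid over multiple arguments to across(): rid is matched to the first argument (the tidy-selection), while cintid is matched to the next argument (.fns), where no object of that name exists. Hence the 'object not found' error.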
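To see the c(...) wrapping in isolation, here is a minimal sketch on made-up data. The tibble, the helper name keep_complete(), and the use of if_all() are my assumptions rather than part of your function: if_all() is available from dplyr 1.0.4 onward and is the recommended stand-in for across() inside filter(), and it takes its column selection as the first argument in the same way.

```r
library(dplyr)

# Hypothetical toy data standing in for farmers.csv.
df <- tibble(
  rid    = c(1, NA, 3, 4),
  cintid = c("a", "b", NA, "d"),
  gc     = c(1, 1, 1, 0)
)

# Hypothetical helper: keep rows where every column named in ... is non-missing
# and gc equals 1, then keep only those columns.
keep_complete <- function(data, ...) {
  data %>%
    # c(...) collects all columns passed in ... into ONE tidy-selection.
    filter(if_all(c(...), ~ !is.na(.x)), gc == 1) %>%
    select(...)
}

result <- keep_complete(df, rid, cintid)
print(result)
```

Only the first row survives here: rows 2 and 3 each have a missing value, and row 4 has gc equal to 0.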

Variable Names in the Filepath

To get those variable names within your filepath, use

extract_ids <-  function(filename, opp, ...) {
  # ...
  
  # Expand the '...' into a list of given variable names, which will get pasted.
  path <- c("/Users/stephenpoole/Downloads/", opp, "_", match.call(expand.dots = FALSE)$`...`, ".csv")

  # ...
}

though you might want to consider replacing match.call(expand.dots = FALSE)$`...`, which currently mushes together the variable names:

"/Users/stephenpoole/Downloads/farmers_ridcintid.csv"

In exactly the same place, you might use the expression paste(match.call(expand.dots = FALSE)$`...`, collapse = "-"), which will separate those variable names using a hyphen:

"/Users/stephenpoole/Downloads/farmers_rid-cintid.csv"

or any other separator of your choice that gives a valid filename.
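If it helps to test the path logic separately from the file I/O, it can be pulled out into a small helper. This is only a sketch in base R: the name build_path() and its dir and sep arguments are hypothetical, introduced for illustration.

```r
# Hypothetical helper: build the export path from the unevaluated names in '...'.
build_path <- function(opp, ..., dir = "/Users/stephenpoole/Downloads/", sep = "-") {
  # Capture the expressions passed in '...' without evaluating them,
  # then convert each one to its character form.
  vars <- as.character(match.call(expand.dots = FALSE)$`...`)
  paste0(dir, opp, "_", paste(vars, collapse = sep), ".csv")
}

path <- build_path("farmers", rid, cintid)
print(path)
# "/Users/stephenpoole/Downloads/farmers_rid-cintid.csv"
```

Because the names are never evaluated, rid and cintid do not need to exist as objects when the path is built.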

Upvotes: 2

MrFlick

Reputation: 206401

If you are passing multiple values to across(), you need to collect them in the first parameter, otherwise they will spread into the other parameters of across(). Try

filter(df, across(c(...), ~ !is.na(.x)) & gc == 1)

Otherwise the second value will be matched to the .fns parameter of across(), and any further values will be passed along as parameters to the function you've specified in across().
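The argument matching can be illustrated without dplyr at all. In this sketch, fake_across() is a hypothetical stand-in that only shares its leading parameter names (.cols, .fns, ...) with across(); it reports which expression landed in which parameter and does nothing else.

```r
# Hypothetical stand-in for across(): same leading parameters, no real work.
fake_across <- function(.cols, .fns = NULL, ...) {
  list(
    .cols = deparse(substitute(.cols)),
    .fns  = deparse(substitute(.fns))
  )
}

# Passing ... directly: each value takes the next positional parameter.
spread <- function(...) fake_across(..., ~ !is.na(.x))

# Wrapping in c(): all values arrive together as the first parameter.
wrapped <- function(...) fake_across(c(...), ~ !is.na(.x))

spread(rid, cintid)   # .cols is "rid", .fns is "cintid"
wrapped(rid, cintid)  # .cols is "c(...)", .fns is the formula
```

In the first call the formula is pushed out of .fns entirely, which is why the real across() ends up looking for an object called cintid.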

Upvotes: 3
