Reputation: 391
This function filters/selects one or more variables from my dataset and writes them to a new CSV file. I'm getting an 'object not found' error when I call the function. Here is the function:
extract_ids <- function(filename, opp, ...) {
  #Read in data
  df <- read_csv(filename)
  #Remove rows 2,3
  df <- df[-c(1,2),]
  #Filter and select
  df_id <- filter(df, across(..., ~ !is.na(.x)) & gc == 1) %>%
    select(...) #not sure if my use of ... here is correct
  #String together variables for export file path
  path <- c("/Users/stephenpoole/Downloads/",opp,"_",...,".csv") #not sure if ... here is correct
  #Export the file
  write_csv(df_id, paste(path,collapse=''))
}
And here is the function call. I'm trying to get columns "rid" and "cintid."
extract_ids(filename = "farmers.csv",
opp = "farmers",
rid, cintid)
When I run this, I get the below error:
Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `across(..., ~!is.na(.x)) & gc == 1`.
x object 'cintid' not found
The column cintid is correct and appears in the data. I've also tried running it with just one column, rid, and get the same 'object not found' error.
Upvotes: 3
Views: 3071
Reputation: 3326
Sorry for omitting this in my previous suggestion to you. Unfortunately, your original question was closed before I could post it as an answer:
If you want your function to resemble dplyr, here are a few modifications you can make. Write your function header as function(filename, opp, ...) verbatim. Then replace !is.na(ID) with across(..., ~ !is.na(.x)) verbatim. Now you can call extract_ids() and, just as you would with any dplyr verb, specify any selection of columns from which you want to filter out NAs: extract_ids(filename = "farmers.csv", opp = "farmers", rid, another_column_you_want_without_NAs).
As MrFlick rightly suggests in their comment, you should wrap ... with c(), so everything you pass in ... is interpreted as the first argument to across(): a single tidy-selection of columns from df:
extract_ids <- function(filename, opp, ...) {
  # ...
  # Filter and select
  df_id <- df %>%
    # This format is preferred for dplyr workflows with pipes (%>%).
    filter(across(c(...), ~ !is.na(.x)) & gc == 1) %>%
    select(...)
  # ...
}
Without this precaution, R interprets rid and cintid as multiple arguments to across(), rather than as simply columns named by the first argument (the tidy-selection).
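To see the argument matching in isolation, here is a minimal base-R sketch. The function show_matching() is a hypothetical stand-in, not part of dplyr; it only borrows the names of the first two parameters of across() (.cols and .fns) and returns the matched call without evaluating anything.

```r
# Hypothetical stand-in that mimics the leading parameters of across(): (.cols, .fns, ...).
show_matching <- function(.cols, .fns, ...) {
  # Return the call with each argument labelled by the parameter it matched.
  # Arguments are never evaluated, so rid and cintid need not exist.
  match.call()
}

# Unwrapped: the second column lands in .fns instead of joining .cols.
unwrapped <- show_matching(rid, cintid, ~ !is.na(.x))

# Wrapped in c(): both columns travel together as .cols.
wrapped <- show_matching(c(rid, cintid), ~ !is.na(.x))

print(unwrapped)
print(wrapped)
```

In the unwrapped call cintid is matched to .fns, which is consistent with the 'object not found' error: it is looked up as if it were a function rather than a column.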
To get those variable names within your filepath, use
extract_ids <- function(filename, opp, ...) {
  # ...
  # Expand the '...' into a list of given variable names, which will get pasted.
  path <- c("/Users/stephenpoole/Downloads/", opp, "_", match.call(expand.dots = FALSE)$`...`, ".csv")
  # ...
}
though you might want to consider replacing match.call(expand.dots = FALSE)$`...`, which currently mushes together the variable names:
"/Users/stephenpoole/Downloads/farmers_ridcintid.csv"
In exactly the same place, you might use the expression paste(match.call(expand.dots = FALSE)$`...`, collapse = "-"), which will separate those variable names using -:
"/Users/stephenpoole/Downloads/farmers_rid-cintid.csv"
or any other separator of your choice that gives a valid filename.
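As a self-contained illustration of the path-building step, here is a sketch in base R only. The helper build_path() and its dir and sep parameters are hypothetical names introduced for this example; the directory default is the one from the question.

```r
# Hypothetical helper: builds the export path from opp and the column names in `...`.
build_path <- function(opp, ..., dir = "/Users/stephenpoole/Downloads/", sep = "-") {
  # Capture the unevaluated `...` as a list of symbols.
  dots <- match.call(expand.dots = FALSE)$`...`
  # Turn each symbol into its name as a string, e.g. rid -> "rid".
  vars <- vapply(dots, deparse, character(1))
  # Join the names with the separator and assemble the full path.
  paste0(dir, opp, "_", paste(vars, collapse = sep), ".csv")
}

path <- build_path("farmers", rid, cintid)
print(path)
```

Because dir and sep come after ... in the signature, they can only be set by name, so bare column names are never mistaken for them.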
Upvotes: 2
Reputation: 206401
If you are passing multiple values to across(), you need to collect them in the first parameter; otherwise they will spread into the other parameters of across(). Try
filter(df, across(c(...), ~ !is.na(.x)))
Otherwise every value other than the first one is matched to the remaining parameters of across() (the second value becomes .fns, for example) instead of being treated as a column.
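Here is a small end-to-end sketch of the c(...) wrapping on toy data. It assumes dplyr 1.0.4 or later and swaps in if_all() for across(), since recent dplyr releases deprecate across() inside filter(); the wrapping works the same way in both. The helper keep_ids() and the data values are made up for illustration.

```r
library(dplyr)

# Toy data standing in for farmers.csv (values are invented for this example).
df <- tibble(
  rid    = c("a", NA, "c", "d"),
  cintid = c("x", "y", NA, "w"),
  gc     = c(1, 1, 1, 2)
)

# Hypothetical helper: keep rows where gc == 1 and none of the columns in `...` is NA,
# then keep only those columns.
keep_ids <- function(data, ...) {
  data %>%
    # c(...) gathers every column passed in `...` into one tidy-selection.
    filter(if_all(c(...), ~ !is.na(.x)) & gc == 1) %>%
    select(...)
}

out <- keep_ids(df, rid, cintid)
print(out)
```

Only the first toy row survives: the second and third each have an NA in one of the selected columns, and the fourth has gc equal to 2.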
Upvotes: 3