Reputation: 180
When writing functions it is important to check for the type of arguments. For example, take the following (not necessarily useful) function which is performing subsetting:
data_subset = function(data, date_col) {
if (!TRUE %in% (is.character(date_col) | is.expression(date_col))){
stop("Input variable date is of wrong format")
}
if (is.character(date_col)) {
x <- match(date_col, names(data))
} else x <- match(deparse(substitute(date_col)), names(data))
sub <- data[,x]
}
I would like to allow the user to provide the column which should be extracted as character or expression (e.g. a column called "date" vs. just date). At the beginning I would like to check that the input for date_col is really either a character value or an expression. However, 'is.expression' does not work:
Error in match(x, table, nomatch = 0L) : object '...' not found
Since deparse(substitute)) works if one provides expressions I thought 'is.expression' has to work as well. What is wrong here, can anyone give me a hint?
Upvotes: 2
Views: 153
Reputation: 14997
I think you are not looking for is.expression
but for is.name
.
The tricky part is to get the type of date_col
and to check if it is of type character only if it is not of type name
. If you called is.character
when it's a name, then it would get evaluated, typically resulting in an error because the object is not defined.
To do this, short circuit evaluation can be used: In
if(!(is.name(substitute(date_col)) || is.character(date_col)))
is.character
is only called if is.name
returns FALSE
.
Your function boils down to:
data_subset = function(data, date_col) {
if(!(is.name(substitute(date_col)) || is.character(date_col))) {
stop("Input variable date is of wrong format")
}
date_col2 <- as.character(substitute(date_col))
return(data[, date_col2])
}
Of course, you could use if(is.name(…))
to convert only to character when date_col
is a name.
This works:
testDF <- data.frame(col1 = rnorm(10), col2 = rnorm(10, mean = 10), col3 = rnorm(10, mean = 50), rnorm(10, mean = 100))
data_subset(testDF, "col1") # ok
data_subset(testDF, col1) # ok
data_subset(testDF, 1) # Error in data_subset(testDF, 1) : Input variable date is of wrong format
However, I don't think you should do this. Consider the following example:
var <- "col1"
data_subset(testDF, var) # Error in `[.data.frame`(data, , date_col2) : undefined columns selected
col1 <- "col2"
data_subset(testDF, col1) # Gives content of column 1, not column 2.
Though this "works as designed", it is confusing because unless carefully reading your function's documentation one would expect to get col1
in the first case and col2
in the second case.
Abusing a famous quote:
Some people, when confronted with a problem, think “I know, I'll use non-standard evaluation.” Now they have two problems.
Hadley Wickham in Non-standard evaluation:
Non-standard evaluation allows you to write functions that are extremely powerful. However, they are harder to understand and to program with. As well as always providing an escape hatch, carefully consider both the costs and benefits of NSE before using it in a new domain.
Unless you expect large benefits from allowing to skip the quotes around the name of the column, don't do it.
Upvotes: 1