user3646834

dplyr 0.7.0 tidyeval in packages

Preamble

I commonly use dplyr in my packages. Prior to 0.7.0, I used the underscored versions of the dplyr verbs to avoid NOTEs during R CMD check. For example, the code:

x <- tibble::tibble(v = 1:3, w = 2)
y <- dplyr::filter(x, v > w)

would have yielded the R CMD check NOTE:

* checking R code for possible problems ... NOTE
no visible binding for global variable ‘v’

By comparison, using the standard evaluation version:

y <- dplyr::filter_(x, ~v > w)

yielded no such NOTE.

However, in dplyr 0.7.0, the vignette Programming with dplyr says that the appropriate syntax for including dplyr functions in packages (to avoid NOTEs) is:

y <- dplyr::filter(x, .data$v > .data$w)

Consequently, the news file says that "the underscored version of each main verb is no longer needed, and so these functions have been deprecated (but remain around for backward compatibility)."

Question

The vignette says that the above new syntax will not yield R CMD check NOTEs, "provided that you’ve also imported rlang::.data with @importFrom rlang .data." However, when I run the following code, I get an error:

y <- dplyr::filter(x, rlang::.data$v > rlang::.data$w)
Evaluation error: Object `From` not found in data.

Is this error similar to the following?

y <- dplyr::filter(x, v == dplyr::n())
Evaluation error: This function should not be called directly.

Namely, for some functions, calling them prefixed with the package yields errors? (Something to do with whether or not they've been exported, perhaps?)

Comment

As an aside, is there a less verbose way of writing package-friendly dplyr functions with the new syntax in 0.7.0? In particular, the syntax for dplyr >=0.7.0:

y <- dplyr::filter(x, .data$v > .data$w)

is more verbose than the syntax for dplyr <0.7.0:

y <- dplyr::filter_(x, ~v > w) 

and the verbosity increases as more variables are referenced. However, I don't want to use the less verbose syntax with the underscored version, as it is deprecated.

Upvotes: 13

Views: 606

Answers (2)

John Mount

Reputation: 118

Another work-around is to add lines such as

v <- NULL  # mark v as a local binding, not an unbound global reference

just above the expressions that trigger the R CMD check NOTEs. It is no less accurate (column names are not in fact global variables), and the binding has a fairly limited scope.
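As a rough sketch of how this might look inside a package function (the function name filter_with_null_bindings is hypothetical, and the example assumes dplyr and tibble are installed):

```r
# Hypothetical function name, for illustration only.
filter_with_null_bindings <- function(x) {
  # Local dummy bindings, so the code checker sees v and w as defined.
  v <- w <- NULL
  # Inside filter(), the columns of x are looked up before these locals,
  # so the comparison still uses the data.
  dplyr::filter(x, v > w)
}

x <- tibble::tibble(v = 1:3, w = 2)
y <- filter_with_null_bindings(x)
```

Because the dummy bindings live inside the function body, they do not leak into the rest of the package.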

Upvotes: 2

Lionel Henry

Reputation: 6803

"for some functions, calling them prefixed with the package yields errors?"

That's right. We could make the prefixed forms work so that behavior is more predictable; you can file a GitHub issue to request this.
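In the meantime, the pattern quoted from the vignette in the question is to import the pronoun once and then write .data without the rlang:: prefix. A minimal sketch, assuming a roxygen2-documented package with dplyr and tibble installed; the function name filter_with_pronoun is hypothetical:

```r
# Hypothetical package function; the name is illustrative only.
# The roxygen tag below adds importFrom(rlang, .data) to NAMESPACE,
# which gives R CMD check a visible binding for `.data`.

#' @importFrom rlang .data
filter_with_pronoun <- function(x) {
  # Refer to columns through the unprefixed .data pronoun.
  dplyr::filter(x, .data$v > .data$w)
}

x <- tibble::tibble(v = 1:3, w = 2)
y <- filter_with_pronoun(x)
```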

"is there a less verbose way of writing package-friendly dplyr functions with the new syntax in 0.7.0?"

The alternative is to declare all your column symbols to R, e.g. with a utils::globalVariables(c("v", "w")) call somewhere in your package.
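A sketch of that approach, assuming the declaration sits at the top level of a file in the package's R/ directory and that dplyr and tibble are installed (the function name filter_with_global_variables is hypothetical):

```r
# Declare the column names once, at the top level of a package source file,
# so that the code checker does not report them as undefined globals.
utils::globalVariables(c("v", "w"))

# Hypothetical function name, for illustration only.
filter_with_global_variables <- function(x) {
  # Bare column names, as in the pre-0.7.0 non-underscored style.
  dplyr::filter(x, v > w)
}

x <- tibble::tibble(v = 1:3, w = 2)
y <- filter_with_global_variables(x)
```

The trade-off is that the declaration applies to the whole package, so a genuinely undefined v or w elsewhere in the package would no longer be flagged.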

Ideally, R should know about NSE functions and never warn for unknown symbols in those cases.

Upvotes: 3
