Reputation: 709

Using tidyr complete() with column names specified in variables

I am having trouble using the tidyr::complete() function with column names as variables.

The built-in example works as expected:

df <- data_frame(
 group = c(1:2, 1),
 item_id = c(1:2, 2),
 item_name = c("a", "b", "b"),
 value1 = 1:3,
 value2 = 4:6
)

df %>% complete(group, nesting(item_id, item_name))

However, when I try to provide the column names as character strings, it produces an error.

gr="group"
id="item_id"
name="item_name"
df %>% complete_(gr, nesting_(id, name),fill = list(NA))

Upvotes: 7

Answers (3)

Logit

Reputation: 113

Even more simply, df %>% complete(!!!syms(gr), nesting(!!!syms(id), !!!syms(name))) works as of tidyr 1.0.2. Note that syms() comes from rlang, so that package needs to be loaded.
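For anyone who wants to try this end to end, here is a minimal sketch of the same idea wrapped in a function. The helper name complete_by_name and its arguments are made up for illustration and are not part of tidyr; it assumes a tidyr version with tidy evaluation (1.0.2 or later, as above) and that rlang is installed.

```r
library(tidyr)  # complete(), nesting()
library(rlang)  # syms()

# Same data as in the question, built with base R to avoid extra dependencies.
df <- data.frame(
  group = c(1, 2, 1),
  item_id = c(1, 2, 2),
  item_name = c("a", "b", "b"),
  value1 = 1:3,
  value2 = 4:6,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a tidyr function): column names arrive as
# character vectors. syms() turns them into symbols and !!! splices
# those symbols into the complete() and nesting() calls.
complete_by_name <- function(data, group_cols, nest_cols) {
  complete(data, !!!syms(group_cols), nesting(!!!syms(nest_cols)))
}

out <- complete_by_name(df, "group", c("item_id", "item_name"))
print(out)
```

Passing the nested columns as one character vector means a single nesting(!!!syms(nest_cols)) covers any number of columns, so there is no need for one !!!syms() call per name.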

Upvotes: 7

Wasabi

Reputation: 3071

Now that tidyr has adopted tidy evaluation, the underscore variants (e.g. complete_) have been deprecated, since the standard variants (e.g. complete) can handle the same cases.

However, complete, crossing and nesting use data masking, so the way to refer to a column whose name is stored in a variable is the .data[[var]] pronoun (per the docs). Your case becomes:

suppressPackageStartupMessages(
  library(tidyr)
)

df <- data.frame(
  group = c(1:2, 1),
  item_id = c(1:2, 2),
  item_name = c("a", "b", "b"),
  value1 = 1:3,
  value2 = 4:6
)

gr <- "group"
id <- "item_id"
name <- "item_name"

df %>% complete(
  .data[[gr]],
  nesting(.data[[id]],
          .data[[name]])
)
#> # A tibble: 4 x 5
#>   group item_id item_name value1 value2
#>   <dbl>   <dbl> <fct>      <int>  <int>
#> 1     1       1 a              1      4
#> 2     1       2 b              3      6
#> 3     2       1 a             NA     NA
#> 4     2       2 b              2      5

Created on 2020-02-28 by the reprex package (v0.3.0)

Not very elegant, but it gets the job done.

Upvotes: 2

alistaire

Reputation: 43364

I think it's a bug that complete_ can't work with data.frames or list columns the way complete can, but here's a workaround using unite_ and separate to simulate nesting (note that separate returns the split columns as character):

df %>% unite_('id_name', c(id, name)) %>% 
    complete_(c(gr, 'id_name')) %>% 
    separate(id_name, c(id, name))

## # A tibble: 4 × 5
##   group item_id item_name value1 value2
## * <dbl>   <chr>     <chr>  <int>  <int>
## 1     1       1         a      1      4
## 2     1       2         b      3      6
## 3     2       1         a     NA     NA
## 4     2       2         b      2      5

Upvotes: 2
