Anand

Reputation: 71

Problem with dtplyr - `by` can't contain join column `V2`, `explicit` which is missing from RHS

I am using dtplyr to speed up operations on a large tibble, and I have run into an error that I can't figure out. The following is a minimal example.

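# Packages used by the example below
library(tibble)     # tibble()
library(dplyr)      # group_by(), summarize(), mutate(), ...
library(tidyr)      # complete(), nesting()
library(lubridate)  # period(), seconds_to_period()
library(dtplyr)     # lazy_dt()

d <- tibble(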
  rownnum = c(1L, 2L),
  stationID = c(1L, 2L),
  groupMemberID = c(0L, 0L),
  workCategory = factor(c("I", "A")),
  stationName = c("RW", "RW"),
  timeSpent = c(period(14060), period(3600)),
  time_grouping = c(as.POSIXct("2023-01-03 00:00:00"), as.POSIXct("2023-01-03 00:00:00")),
  time_grouping_label = c("2023-01-03", "2023-01-03")
)

d
# A tibble: 2 × 8
  rownnum stationID groupMemberID workCategory stationName timeSpent time_grouping              time_grouping_label
    <int>     <int>         <int> <fct>        <chr>       <Period>  <dttm>                     <chr>              
1       1         1             0 I            RW          14060S    2023-01-03 00:00:00.000000 2023-01-03         
2       2         2             0 A            RW          3600S     2023-01-03 00:00:00.000000 2023-01-03   


d <- d |>
  lazy_dt() |> 
  group_by(stationID, stationName,
           time_grouping, time_grouping_label,
           workCategory) |>
  summarize(cat_time = sum(timeSpent), cat_count = n(),
            .groups = "drop_last") |>
  mutate(tot_time = sum(cat_time), pct = 100 * cat_time/tot_time,
         tot_count = sum(cat_count)) |>
  ungroup() |>
  mutate(cat_time_period = seconds_to_period(cat_time),
         tot_time_period = seconds_to_period(tot_time)) |>
  complete(workCategory, nesting(stationID, stationName,
                                 time_grouping, time_grouping_label),
           fill = list(cat_time = 0, cat_count = 0, tot_time = 0, pct = 0,
                       tot_count = 0, cat_time_period = period(0),
                       tot_time_period = period(0)),
           explicit = FALSE) |>
  relocate(workCategory, .after = time_grouping_label) |>
  arrange(stationID, time_grouping, workCategory) |> 
  as_tibble()

Error in `common_by.list()` at dplyr/R/join-common-by.R:10:3:
! `by` can't contain join column `V2`, `explicit` which is missing from RHS.
Run `rlang::last_trace()` to see where the error occurred.

Any ideas as to what might be going wrong and how to fix it?

I have tried various things, such as removing the `fill` specification from `complete()` and replacing `complete()` with `expand()`, to no avail. I would appreciate any help.
