Alex

Reputation: 15708

Is this a bug in group_by and lead/lag?

Example:

library(dplyr) # version 0.4.3

df <- 
    data.frame(hour = 0:11, minutes = runif(12, 0, 59), count = rpois(12, 3)) %>%
    arrange(hour, minutes)

df %>%
    group_by(hour) %>%
    mutate(diff = count - lag(count, default = max(count)))

raises an error:

Error: expecting a single value

The following raises a different error:

df %>%
    group_by(hour) %>%
    mutate(diff = count - lag(count, default = count))

Error: not compatible with requested type

I feel that both should work and return a data frame whose diff column is all zeros. There is only one row per group, so I expect the default value for the non-existent previous row to be the maximum count in that group.

Upvotes: 3

Views: 162

Answers (1)

akrun

Reputation: 887128

The first error seems to be version-specific. The second one can be avoided by using the first (or last) observation of 'count' as the default:

df %>%
   group_by(hour) %>%
   mutate(diff = count - lag(count, default = first(count)))
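
To see what the `first(count)` default does within each group, here is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not from the question). It assumes a reasonably recent dplyr; behaviour under 0.4.3 may differ.

```r
library(dplyr)

# Hypothetical toy data: two rows for hour 0, one row for hour 1
toy <- data.frame(hour = c(0, 0, 1), count = c(2L, 5L, 4L))

result <- toy %>%
  group_by(hour) %>%
  # For the first row of each group there is no previous row,
  # so lag() falls back to the group's first count
  mutate(diff = count - lag(count, default = first(count))) %>%
  ungroup()

result$diff
```

The first row of every group gets a diff of zero, so with one row per group, as in the question's data, the whole column would be zeros.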

Upvotes: 3
