nathaneastwood

Reputation: 3764

How does dplyr handle empty groups when displaying "groups" attribute?

Taking an example from the dplyr tests:

library(dplyr)

df <- data.frame(
  e = 1,
  f = factor(c(1, 1, 2, 2), levels = 1:3),
  g = c(1, 1, 2, 2),
  x = c(1, 2, 1, 4)
) %>%
  group_by(e, f, g, .drop = FALSE)

I don't quite understand why or how the "groups" attribute ends up defined like this:

attr(df, "groups")
# # A tibble: 3 x 4
#       e f         g       .rows
#   <dbl> <fct> <dbl> <list<int>>
# 1     1 1         1         [2]
# 2     1 2         2         [2]
# 3     1 3        NA         [0]

The third row doesn't make any sense to me, since it's not a valid group within the original data. I'd have thought the result would be:

# # A tibble: 3 x 4
#       e f         g       .rows
#   <dbl> <fct> <dbl> <list<int>>
# 1     1 1         1         [2]
# 2     1 2         2         [2]
# 3    NA 3        NA         [0]

Upvotes: 1

Views: 216

Answers (1)

akrun

Reputation: 887571

It is most likely due to recycling, which occurs in many functions, for example:

 data.frame(e = 1, b = c(2, 4), c = c(2, 3, 2, 4))

Here, the single `e` value of 1 and the two `b` values get recycled to match the length of `c`. It may be that, in the "groups" attribute, recycling happens only when a column has a single unique value.
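
As a rough illustration of the recycling in the `data.frame()` call above, here is a minimal base R sketch. It only shows how `data.frame()` itself behaves. Whether dplyr builds the "groups" attribute the same way is an assumption, not something confirmed here. The names `recycled` and `res` are just placeholders for this example.

```r
# Columns of length 1 and 2 are recycled to match the longest column (length 4).
recycled <- data.frame(e = 1, b = c(2, 4), c = c(2, 3, 2, 4))

recycled$e  # the single value is repeated for every row
recycled$b  # the pair of values is repeated twice

# Recycling only works when the longest length is a whole multiple of the
# shorter one; a length-3 column next to a length-4 column raises an error.
res <- tryCatch(
  data.frame(b = c(2, 4, 6), c = c(2, 3, 2, 4)),
  error = function(err) conditionMessage(err)
)
res  # holds the error message rather than a data frame
```

So a column with a single value, like `e`, can always be stretched to any number of rows, which would be consistent with `e` being 1 rather than `NA` in the empty group.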

Upvotes: 4
