gh0strider18

Reputation: 1140

How to create total frequency table using dplyr

I am getting an unexpected result when using dplyr to create a total relative frequency table and grouping by two variables. Here is an example:

library(dplyr)

set.seed(1234)
dat1 = data.frame(
  color = c(rep("red", 4), rep("green", 4)),
  type = rep(c("big", "small"), 4),
  value = sample(1:6, 8, replace = TRUE)
)

dat1 %>%
  group_by(color, type) %>%
  summarise(n = n()) %>%
  mutate(total = sum(n), rel.freq = n / total)

Here is the result of the preceding code:

# A tibble: 4 x 5
# Groups:   color [2]
  color type      n total rel.freq
  <fct> <fct> <int> <int>    <dbl>
1 green big       2     4    0.500
2 green small     2     4    0.500
3 red   big       2     4    0.500
4 red   small     2     4    0.500

However, I would expect this:

# A tibble: 4 x 5
# Groups:   color [2]
  color type      n total rel.freq
  <fct> <fct> <int> <int>    <dbl>
1 green big       2     8    0.250
2 green small     2     8    0.250
3 red   big       2     8    0.250
4 red   small     2     8    0.250

Any insight into why the mutate() in the dplyr pipe above is grouping only by the first grouping variable (or why it is grouping at all; my understanding is that it should be working on the whole summarise() result) would be greatly appreciated.

The total column should indicate that there are 8 cases in total (i.e., sum(n) from the previous result in summarise() should = 8).

Upvotes: 2

Views: 3913

Answers (1)

akrun

Reputation: 887118

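By default, each summarise() drops one level of grouping, namely the last variable passed to group_by(). Here the result is therefore still grouped by color, so sum(n) in the mutate() is computed within each color (2 + 2 = 4) rather than over all rows. We need to ungroup() after the summarise():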

dat1 %>% 
  group_by(color, type) %>% 
  summarise(n = n()) %>%
  ungroup() %>% 
  mutate(total = sum(n), rel.freq = n / total)
# A tibble: 4 x 5
#  color type      n total rel.freq
#  <fct> <fct> <int> <int>    <dbl>
#1 green big       2     8     0.25
#2 green small     2     8     0.25
#3 red   big       2     8     0.25
#4 red   small     2     8     0.25
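
As a rough sketch of two alternatives to the explicit ungroup(): the `.groups = "drop"` argument of summarise() (this assumes dplyr 1.0.0 or later, which the question does not state), or count(), which returns an ungrouped result when its input is ungrouped. The data frame below rebuilds dat1 without the value column, since that column does not affect the counts; the names `after_summarise`, `res_drop` and `res_count` are only illustrative.

```r
library(dplyr)

# Same color/type layout as dat1 in the question (value column omitted).
dat1 <- data.frame(
  color = rep(c("red", "green"), each = 4),
  type  = rep(c("big", "small"), times = 4)
)

# Inspect what grouping is left after a default summarise():
# only the last grouping variable (type) is dropped.
after_summarise <- dat1 %>%
  group_by(color, type) %>%
  summarise(n = n())
group_vars(after_summarise)  # the remaining grouping variable(s)

# Option 1: drop all grouping inside summarise() (assumes dplyr >= 1.0.0).
res_drop <- dat1 %>%
  group_by(color, type) %>%
  summarise(n = n(), .groups = "drop") %>%
  mutate(total = sum(n), rel.freq = n / total)

# Option 2: count() on an ungrouped data frame gives an ungrouped result.
res_count <- dat1 %>%
  count(color, type) %>%
  mutate(total = sum(n), rel.freq = n / total)
```

Both options should produce the same total and rel.freq columns as the ungroup() version; which one to use is mostly a matter of taste and of the dplyr version available.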

Upvotes: 5
