haimen

Reputation: 2015

Group by proportions in R

I am working with the mtcars data set and have the following two summaries:

  library(dplyr)
  data(mtcars)

> mtcars %>% group_by(gear) %>% summarise(gear_count = n())
# A tibble: 3 x 2
   gear gear_count
  <dbl>      <int>
1     3         15
2     4         12
3     5          5

> mtcars %>% group_by(gear, vs) %>% summarise(gear_vs_count = n())
# A tibble: 6 x 3
# Groups:   gear [?]
   gear    vs gear_vs_count
  <dbl> <dbl>         <int>
1     3     0            12
2     3     1             3
3     4     0             2
4     4     1            10
5     5     0             4
6     5     1             1

I want to produce the following, where ratio = gear_vs_count / gear_count:

   gear    vs gear_vs_count gear_count ratio
  <dbl> <dbl>         <int>      <int> <dbl>
1     3     0            12         15 0.8
2     3     1             3         15 0.2
3     4     0             2         12 0.167
4     4     1            10         12 0.833
5     5     0             4          5 0.8
6     5     1             1          5 0.2

One way to do this is through a join, but I think there should be an easier way with dplyr. Can anybody help me with this?
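For reference, the join-based version might look roughly like the sketch below. The intermediate names gear_counts, gear_vs_counts and joined are placeholders chosen for illustration, not anything from existing code.

```r
library(dplyr)

# Per-gear totals
gear_counts <- mtcars %>%
    group_by(gear) %>%
    summarise(gear_count = n())

# Per-(gear, vs) counts; ungroup() drops the leftover grouping by gear
gear_vs_counts <- mtcars %>%
    group_by(gear, vs) %>%
    summarise(gear_vs_count = n()) %>%
    ungroup()

# Attach the per-gear total to each (gear, vs) row, then compute the ratio
joined <- gear_vs_counts %>%
    left_join(gear_counts, by = "gear") %>%
    mutate(ratio = gear_vs_count / gear_count)

joined
```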

Thanks

Upvotes: 0

Views: 67

Answers (1)

bouncyball

Reputation: 10761

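We can use count() to get the count for each gear/vs combination, then group_by(gear) and mutate() to add the per-gear total and the ratio.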

mtcars %>%
    count(gear, vs) %>%
    group_by(gear) %>%
    mutate(gear_count = sum(n), ratio = n / sum(n))

#    gear    vs     n gear_count ratio
#   <dbl> <dbl> <int>      <int> <dbl>
# 1     3     0    12         15 0.8  
# 2     3     1     3         15 0.2  
# 3     4     0     2         12 0.167
# 4     4     1    10         12 0.833
# 5     5     0     4          5 0.8  
# 6     5     1     1          5 0.2  

To rename the n column to gear_vs_count, add %>% rename(gear_vs_count = n) to the end of the pipeline.
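As a variation, and assuming dplyr 0.8.0 or later (where count() accepts a name argument), the count column can be named up front so no rename step is needed. This is only a sketch; the object name result and the final sanity check are additions for illustration.

```r
library(dplyr)

# Assumes dplyr >= 0.8.0 for the `name` argument of count()
result <- mtcars %>%
    count(gear, vs, name = "gear_vs_count") %>%
    group_by(gear) %>%
    mutate(gear_count = sum(gear_vs_count),
           ratio = gear_vs_count / gear_count) %>%
    ungroup()

result

# Sanity check: the ratios within each gear should sum to 1
result %>%
    group_by(gear) %>%
    summarise(total = sum(ratio))
```

The ungroup() at the end is optional. It just avoids carrying the grouping by gear into later steps.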

Upvotes: 3
