Jed

Reputation: 399

How to create a column of percentages within a grouped dataframe?

I have created a frequency table, DF, using the code below. However, I would also like to add a column of percentages/proportions to the table, showing the percentage/proportion of each Function within each key. I am not sure how to adapt my code to do this. Any advice and help would be appreciated!

DF <- df %>%
  gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(Function, key) %>%
  summarise(freq = sum(freq))

Upvotes: 0

Views: 63

Answers (2)

Ronak Shah

Reputation: 389215

Try this:

library(dplyr)
df %>%
  tidyr::gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>% 
  group_by(key, Function) %>% 
  summarise(freq=sum(freq)) %>% #..... (1)
  mutate(freq = freq/sum(freq))

Note that:

  • gather() has been superseded, so use pivot_longer() in new code.
  • No explicit group_by(key) is needed before the final mutate(): summarise() at (1) drops only the last grouping level, i.e. Function, so the result is still grouped by key when the proportions are computed.
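As a rough sketch of the pivot_longer() version mentioned in the notes, the pipeline might look like the following. The toy data frame `df` and the column name `prop` are assumptions for illustration only and are not from the question; `.groups = "drop_last"` simply spells out summarise()'s default behaviour.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the question): counts in columns A and B
df <- data.frame(
  Function = c("f1", "f2", "f1", "f2"),
  A = c(1, 3, 2, 4),
  B = c(10, 30, 20, 40)
)

result <- df %>%
  # pivot_longer() replaces gather(): every column except Function becomes key/freq pairs
  pivot_longer(cols = -Function, names_to = "key", values_to = "freq") %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(key, Function) %>%
  # drop only the last grouping level (Function); the result stays grouped by key
  summarise(freq = sum(freq), .groups = "drop_last") %>%
  # sum(freq) is therefore computed within each key
  mutate(prop = freq / sum(freq))
```

Because the result is still grouped by key, the proportions in `prop` should sum to 1 within each key.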

Upvotes: 1

Pablo B.

Reputation: 121

If I understood your problem correctly, you can continue by grouping by key and then calculating the percentage/proportion:

df %>%
  gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(Function, key) %>%
  summarise(freq = sum(freq)) %>%
  group_by(key) %>%
  mutate(prop = freq / sum(freq))
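If percentages are wanted rather than proportions, one option is to scale the proportion by 100. The snippet below is only a sketch: `freq_tbl` is a hypothetical stand-in for the summarised table, and the column name `pct` and the rounding to one decimal place are assumptions.

```r
library(dplyr)

# Hypothetical summarised frequency table (stands in for the output of summarise())
freq_tbl <- data.frame(
  Function = c("f1", "f2", "f1", "f2"),
  key = c("A", "A", "B", "B"),
  freq = c(3, 7, 30, 70)
)

with_pct <- freq_tbl %>%
  group_by(key) %>%                      # explicit grouping: totals are taken per key
  mutate(prop = freq / sum(freq),        # proportion of each Function within its key
         pct = round(100 * prop, 1)) %>% # same value expressed as a percentage
  ungroup()                              # drop grouping so later steps are not affected
```

Grouping by key explicitly makes the result independent of the order of the variables in the earlier group_by() call.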

Upvotes: 0
