Reputation: 47
I have a data frame that looks like this:
soccer <- data.frame(
A=c("soccer1", "soccer1", "soccer2", "soccer2", "soccer3", "soccer3"),
game=c(1, 2, 1, 2, 1, 2),
number=c(1000, 1500, 200, 2100, 650, 1850)
)
I am trying to group rows that share the same value in column "A" (e.g. soccer1), and then use the summarise function to show the ratio of the number of fans at game 1 to the total number of fans at games 1 and 2. I am stuck here:
soccer%>%
group_by(A)%>%
summarise(n=n[game==1]/n[game==2+game==1]*100)
I cannot figure out how to sum two numbers for the denominator.
Upvotes: 1
Views: 55
Reputation: 17195
You could try either of the approaches below inside mutate(), depending on whether you want the raw proportions (numeric) or a formatted percentage (character):
library(dplyr)
soccer %>%
  group_by(A) %>%
  mutate(prop = number / sum(number),  # proportions (numeric)
         perc = paste0(sprintf("%2.f", number / sum(number) * 100), "%"))  # percentage (character)
Output:
# A game number prop perc
# <chr> <dbl> <dbl> <dbl> <chr>
# 1 soccer1 1 1000 0.4 "40%"
# 2 soccer1 2 1500 0.6 "60%"
# 3 soccer2 1 200 0.0870 " 9%"
# 4 soccer2 2 2100 0.913 "91%"
# 5 soccer3 1 650 0.26 "26%"
# 6 soccer3 2 1850 0.74 "74%"
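If you want one row per team, as the summarise attempt in the question suggests, a minimal sketch could look like the following. The names game1_share and pct_game1 are made up for illustration, and sum() is wrapped around the numerator on the assumption that a team might have more than one game 1 row:

```r
library(dplyr)

# Same data as in the question
soccer <- data.frame(
  A = c("soccer1", "soccer1", "soccer2", "soccer2", "soccer3", "soccer3"),
  game = c(1, 2, 1, 2, 1, 2),
  number = c(1000, 1500, 200, 2100, 650, 1850)
)

# One row per team: game 1 attendance as a percentage of all games
game1_share <- soccer %>%
  group_by(A) %>%
  summarise(pct_game1 = sum(number[game == 1]) / sum(number) * 100)

print(game1_share)
# pct_game1 is 40 for soccer1, about 8.70 for soccer2, and 26 for soccer3
```

The denominator is simply sum(number) within each group, so there is no need to subset it by game.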
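In base R, you could do this (it relies on the rows already being sorted by A, as they are here, because tapply returns its results in group order):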
soccer$props <- unlist(tapply(soccer$number, soccer$A,
FUN = function(x) x / sum(x)))
# A game number props
#1 soccer1 1 1000 0.40000000
#2 soccer1 2 1500 0.60000000
#3 soccer2 1 200 0.08695652
#4 soccer2 2 2100 0.91304348
#5 soccer3 1 650 0.26000000
#6 soccer3 2 1850 0.74000000
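If the rows might not be sorted by A, one possible base R alternative is ave(), which returns values in the original row order. This is only a sketch; the column name props_ave is made up for illustration:

```r
# Same data as in the question
soccer <- data.frame(
  A = c("soccer1", "soccer1", "soccer2", "soccer2", "soccer3", "soccer3"),
  game = c(1, 2, 1, 2, 1, 2),
  number = c(1000, 1500, 200, 2100, 650, 1850)
)

# ave() applies FUN within each level of A and keeps the result
# aligned with the original rows, whatever their order
soccer$props_ave <- ave(soccer$number, soccer$A,
                        FUN = function(x) x / sum(x))

print(soccer)
```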
As noted in the comments, @Adam Quek proposes an elegant solution as well, using proportions() (the newer base R name for prop.table()):
soccer %>% group_by(A) %>% mutate(prop = proportions(number))
Upvotes: 1