Calculate percentage in Dataframe according to a condition

Question

Assume I have the following data frame. I need to calculate the percentage of ages under 18, grouped by ID and Group. For example, the result should be 50% for ID 1, Group a, or 0% for ID 3. I could do it in two steps by counting all rows and counting the rows with ages under 18, then merging the two frames together, but I would like to know whether it can be done in one step.

# DF is the data frame shown below
a <- DF %>% group_by(ID, Group) %>% summarize(countAllData = n())
b <- DF %>% group_by(ID, Group) %>% filter(Age < 18) %>% summarize(countUnder18 = n())
final <- merge(a, b, by = c("ID", "Group"), all = TRUE)
# groups without any age under 18 are missing from b, so fill them with 0
final[is.na(final)] <- 0
roundedPercentage <- round((final$countUnder18 / final$countAllData) * 100)
final <- cbind(final, roundedPercentage)

Any suggestions?

ID Group Age
1  a      20
1  a      17 
1  b      16
2  c      23
2  c      11
2  d      12
3  e      20

G. Grothendieck · Accepted Answer

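Take the mean of the indicator variable Age < 18. The mean of a logical vector is the proportion of TRUE values, so multiplying it by 100 gives the percentage. The last line (as.data.frame) is optional, but the output in this example looks a bit better if you use it.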

library(dplyr)

DF %>% 
   group_by(ID, Group) %>% 
   summarize("%Under18" = round(100 * mean(Age < 18))) %>% 
   ungroup %>%
   as.data.frame

giving:

  ID Group %Under18
1  1     a       50
2  1     b      100
3  2     c       50
4  2     d      100
5  3     e        0
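
To see why the mean of the indicator gives the percentage, here is a minimal base R sketch for a single group (ID 1, Group a from the example data). The variable names ages, under18 and pct are only illustrative and do not appear in the answer.

```r
# Ages for ID 1, Group a, taken from the example data
ages <- c(20, 17)

# The comparison yields a logical vector: FALSE TRUE
under18 <- ages < 18

# mean() treats TRUE as 1 and FALSE as 0, so it returns the share of TRUE values
pct <- round(100 * mean(under18))
print(pct)  # 50
```

The dplyr pipeline above applies the same computation once per (ID, Group) combination.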

Note

The input in reproducible form:

Lines <- "
ID Group Age
1  a      20
1  a      17 
1  b      16
2  c      23
2  c      11
2  d      12
3  e      20"
DF <- read.table(text = Lines, header = TRUE)
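
If the two counts from the question are also wanted, one possible variant computes them in the same summarize call. This is a sketch rather than part of the accepted answer: countAllData and countUnder18 are the names used in the question, while pctUnder18 and res are made-up names for illustration.

```r
library(dplyr)

# Same input as in the Note, repeated so the block runs on its own
DF <- read.table(text = "
ID Group Age
1  a      20
1  a      17
1  b      16
2  c      23
2  c      11
2  d      12
3  e      20", header = TRUE)

res <- DF %>%
  group_by(ID, Group) %>%
  summarize(
    countAllData = n(),                       # rows per group
    countUnder18 = sum(Age < 18),             # TRUE counts as 1
    pctUnder18   = round(100 * mean(Age < 18))
  ) %>%
  ungroup() %>%
  as.data.frame()

print(res)
```

Because sum(Age < 18) is 0 for a group with no ages under 18, no merge and no NA replacement are needed.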
