Reputation: 1
My data looks something like this, but with 100+ rows:
ID | Grade | Race/Ethnicity |
---|---|---|
1 | 0 | White |
2 | 2 | Asian |
3 | 2 | Hispanic |
4 | 0 | Asian |
5 | 3 | Black |
6 | 1 | White |
7 | 2 | Hispanic |
8 | 1 | Black |
For each race/ethnicity group, I want to calculate the percentage of students who scored a grade >1 out of the total number of students in that group (e.g. the number of White students with a grade >1 divided by the total number of White students in the class, and likewise for every other racial/ethnic category). The result should have one row per race/ethnicity category with the corresponding percent:
Race/Ethnicity | Percent |
---|---|
White | 0% |
Asian | 50% |
Hispanic | 100% |
Black | 50% |
I tried the code below:
data %>%
select(`Race/Ethnicity`) %>%
mutate(Percent = scales::label_percent()(ave(data$Grade>1, data$`Race/Ethnicity`, FUN=mean)))
However, it gives me the following output, with one row per student, so the values repeat for each race/ethnicity. I just want one row per race/ethnicity category:
Race/Ethnicity | Percent |
---|---|
White | 0% |
Asian | 50% |
Hispanic | 100% |
Asian | 50% |
Black | 50% |
White | 0% |
Hispanic | 100% |
Black | 50% |
Upvotes: 0
Views: 286
Reputation: 8826
data <-
tibble::tribble(
~id, ~grade, ~race_ethnicity,
1L, 0L, "White",
2L, 2L, "Asian",
3L, 2L, "Hispanic",
4L, 0L, "Asian",
5L, 3L, "Black",
6L, 1L, "White",
7L, 2L, "Hispanic",
8L, 1L, "Black"
)
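library(dplyr)

data %>%
  # one group per race/ethnicity, so summarise() returns one row per group
  group_by(race_ethnicity) %>%
  # mean() of a logical vector is the share of TRUE values
  summarise(percent = mean(grade > 1)) %>%
  mutate(percent = scales::label_percent()(percent))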
# A tibble: 4 x 2
race_ethnicity percent
<chr> <chr>
1 Asian 50%
2 Black 50%
3 Hispanic 100%
4 White 0%
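The difference from the attempt in the question is that `mutate()` with `ave()` keeps one row per student, while `group_by()` followed by `summarise()` collapses each group to a single row. Below is a minimal sketch of the same idea using the original column names from the question (`ID`, `Grade`, `Race/Ethnicity`); the object names `result` and `result_distinct` are only illustrative. The second variant stays closer to the original attempt by keeping `ave()` and then dropping the duplicated rows with `distinct()`.

```r
library(dplyr)

# Sample data from the question, keeping the original column names
data <- tibble::tribble(
  ~ID, ~Grade, ~`Race/Ethnicity`,
  1L, 0L, "White",
  2L, 2L, "Asian",
  3L, 2L, "Hispanic",
  4L, 0L, "Asian",
  5L, 3L, "Black",
  6L, 1L, "White",
  7L, 2L, "Hispanic",
  8L, 1L, "Black"
)

# Variant 1: summarise() collapses each group to one row
result <- data %>%
  group_by(`Race/Ethnicity`) %>%
  summarise(Percent = mean(Grade > 1), .groups = "drop") %>%
  mutate(Percent = scales::label_percent()(Percent))

# Variant 2: keep ave() from the original attempt, then remove repeated rows
result_distinct <- data %>%
  transmute(
    `Race/Ethnicity`,
    Percent = scales::label_percent()(ave(Grade > 1, `Race/Ethnicity`, FUN = mean))
  ) %>%
  distinct()

print(result)
print(result_distinct)
```

Both variants should give one row per category. The first orders the rows by group value, while the second keeps the order in which each category first appears in the data.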
Upvotes: 0