Reputation: 47
I'm trying to reproduce the following code across 36 separate columns in a df. Instead of retyping this code 36 times, how can I apply a function that produces 36 different "Values", one for each column? The columns in the df I need to apply it to are 41:77.
number_of_ppl_1 <- nrow(df %>%
filter(percent_score_1 >= 80.0))
number_of_ppl_2 <- nrow(df %>%
filter(percent_score_2 >= 80.0))
The data looks like this:
percent_score_1 percent_score_2
90 80
100 60
60 90
In case it is not clear, I need to find out how many people in each percent_score column have a score greater than or equal to 80%.
Thanks!!
Upvotes: 0
Views: 205
Reputation: 389135
In base R, you can use colSums and colMeans to get the count and the proportion, respectively (multiply the proportion by 100 for a percentage).
#Count
colSums(df[41:77] >= 80, na.rm = TRUE)
#Proportion (multiply by 100 for a percentage)
colMeans(df[41:77] >= 80, na.rm = TRUE)
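As a quick check, here is a minimal sketch that applies the same idea to the three sample rows from the question. The data frame name `toy` is made up for illustration, and its two columns stand in for columns 41:77 of the real data. The renaming step is an optional extra, not part of the approach above.

```r
# Toy data copied from the question; `toy` is a hypothetical name.
toy <- data.frame(
  percent_score_1 = c(90, 100, 60),
  percent_score_2 = c(80, 60, 90)
)

# Comparing a data frame with a number gives a logical matrix,
# so colSums() counts the TRUE values in each column.
counts <- colSums(toy >= 80, na.rm = TRUE)

# Optional: rename to match the number_of_ppl_<n> pattern from the question.
names(counts) <- sub("percent_score_", "number_of_ppl_", names(counts), fixed = TRUE)
counts
```

The result is a single named vector rather than 36 separate variables, so an individual count can be looked up by name, for example counts["number_of_ppl_1"].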
Upvotes: 1
Reputation: 16988
You could use tidyverse:
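library(stringr)
library(dplyr)
df %>%
  # Count the scores >= 80 in each of the columns 41:77
  summarise(across(41:77, ~ sum(.x >= 80))) %>%
  # summarise() keeps only the summarised columns, so rename all of them
  rename_with(
    ~ paste0("number_of_ppl_", str_extract(.x, "\\d+")),
    everything()
  )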
Upvotes: 1
Reputation: 49650
You can do this with sapply using code like this:
percents <- sapply(mydf[41:77], function(x) mean(x >= 80))
If you want the count instead of the proportion, replace mean with sum.
The tidyverse answer is probably something like this:
library(dplyr)
library(purrr)
percents <- mydf %>%
  select(41:77) %>%
  map_dbl(~ mean(.x >= 80))
Upvotes: 0