jarciniegas

Reputation: 47

Filtering across multiple columns to get number of rows in R

I'm trying to reproduce the following code across 36 separate columns in a data frame. Instead of retyping this code 36 times, how can I apply a function that produces 36 different "Values", one for each column? The columns in the df I need to apply it to are 41:77.

number_of_ppl_1 <- nrow(df %>%
  filter(percent_score_1 >= 80.0))

number_of_ppl_2 <- nrow(df %>%
  filter(percent_score_2 >= 80.0))

The data looks like this:

   percent_score_1 percent_score_2 
     90                80
     100               60
     60                90

In case it is not clear, I need to find out how many people in each column (percent_score) have a score greater than or equal to 80%.

Thanks!!

Upvotes: 0

Views: 205

Answers (4)

Ronak Shah

Reputation: 389135

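In base R, you can use colSums and colMeans to get the count and the proportion (multiply by 100 for a percentage), respectively.

# Count of values >= 80 in each column
colSums(df[41:77] >= 80, na.rm = TRUE)

# Proportion of values >= 80 in each column (multiply by 100 for a percentage)
colMeans(df[41:77] >= 80, na.rm = TRUE)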
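
As a quick check, here is a minimal sketch of the same idea run on the sample data from the question. The two-column data frame is rebuilt from the question's example, so the `41:77` column selection is dropped; the names `at_least_80`, `counts` and `percentages` are only illustrative.

```r
# Toy data rebuilt from the question's example (assumption: the real df has many more columns)
df <- data.frame(
  percent_score_1 = c(90, 100, 60),
  percent_score_2 = c(80, 60, 90)
)

# Comparing a data frame to a number gives a logical matrix of the same shape
at_least_80 <- df >= 80

# Number of TRUE values per column
counts <- colSums(at_least_80, na.rm = TRUE)

# Share of TRUE values per column, scaled from a proportion to a percentage
percentages <- 100 * colMeans(at_least_80, na.rm = TRUE)

print(counts)
print(round(percentages, 1))
```

Both results are named vectors, so a single value can be pulled out by column name, for example `counts[["percent_score_1"]]`.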

Upvotes: 1

Martin Gal

Reputation: 16988

You could use tidyverse:

library(stringr)
library(dplyr)

df %>% 
  summarise(across(41:77, ~ sum(.x >= 80))) %>% 
  # summarise() keeps only the summarised columns, so rename all of them
  rename_with(
    ~ paste0("number_of_ppl_", str_extract(.x, "\\d+")), 
    everything()
    )

Upvotes: 1

Greg Snow

Reputation: 49650

You can do this with sapply using code like this:

percents <- sapply(mydf[41:77], function(x) mean(x >= 80))

If you want the count instead of the percent then replace mean with sum.

The tidyverse answer is probably something like this:

library(dplyr)
library(purrr)

percents <- mydf %>%
  select(41:77) %>%
  map_dbl(~ mean(.x >= 80))
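
If the goal is to end up with values labelled like the `number_of_ppl_1`, `number_of_ppl_2` objects in the question, one option is to rename the result of `sapply`. This is only a sketch on toy data rebuilt from the question; it assumes the columns follow the `percent_score_<n>` naming pattern shown there.

```r
# Toy data rebuilt from the question's example (assumption: real columns are df[41:77])
df <- data.frame(
  percent_score_1 = c(90, 100, 60),
  percent_score_2 = c(80, 60, 90)
)

# One count per column: how many values are at least 80
counts <- sapply(df, function(x) sum(x >= 80, na.rm = TRUE))

# Swap the column-name prefix so each count carries the desired label
names(counts) <- sub("percent_score_", "number_of_ppl_", names(counts), fixed = TRUE)

print(counts)

# Individual values can then be looked up by name
counts[["number_of_ppl_1"]]
```

Keeping the counts in one named vector is usually easier to work with than creating 36 separate objects in the workspace.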

Upvotes: 0

mlneural03

Reputation: 853

You could try this, which keeps only the columns that contain at least one value of 80 or more:

df[ , sapply(df, function(x) any(x>=80))]

Upvotes: 0
