Neoleogeo
Neoleogeo

Reputation: 323

create a new summary variable if condition across many columns

I have a dataframe with an ID variable and a bunch of similarly named columns with information

+-------------------------------------------------
|         ID   | C1   |    C2     |   C3    | ...
+-------------------------------------------------
|          1   | 99   |  101      |   102   | ...
+-------------------------------------------------

I need to count the number of columns that fulfil certain condition (e.g. <100) If the number of columns was small I would do something like

df %>% mutate (counter= case_when(C1 <100 & C2<100 & C3<100 ~ "3",
                                  C1<100 & C2<100 ~ 2, ...)

But that is obviously not an option with 100+ columns. I Could also pivot, summarise and pivot back, but it also seems like not the cleanest solution. Any ideas of how to do this properly?

Upvotes: 0

Views: 102

Answers (1)

akrun
akrun

Reputation: 887193

We may use rowSums from base R on a logical matrix (df[-1] < 100) to get the count of elements in each row that are less than 100.

df$counter <- rowSums(df[-1] < 100, na.rm = TRUE)

TRUE -> 1 and FALSE -> 0, thus, when we take the row wise sum of logical matrix, each TRUE will be incremented as 1.


Or in a dplyr pipe

library(dplyr)
df %>%
    mutate(counter = rowSums(across(-1) < 100, na.rm = TRUE))

Upvotes: 1

Related Questions