giegie

Reputation: 463

Count values above 0 and count how many match a pattern in a row (in R)

For each column, I would like to count how many rows have a value > 0, and how many of those rows (the ones that are > 0) have a name starting with "mt-". The result should also be a data frame. Here is an example.

df1

mt-abc 1  0  2
mt-dca 1  1  2
cla    0  2  0
dla    0  3  0

result

above0 2  3  2
mt     2  1  2

Upvotes: 0

Views: 648

Answers (2)

Ronak Shah

Reputation: 389285

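In base R you can do:

# Logical matrix: TRUE where a value is > 0 (the first column holds the names)
mat <- df[-1] > 0

# Count TRUEs per column, overall and for rows whose name starts with 'mt'
rbind(above0 = colSums(mat), 
      mt = colSums(startsWith(df$V1, 'mt') & mat))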

#       V2 V3 V4
#above0  2  3  2
#mt      2  1  2
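Note that `rbind()` returns a matrix here, while the question asks for a data frame. A minimal sketch of one way to get there, assuming the same `df` as in the data section (rebuilt below so the snippet runs on its own) and wrapping the result in `as.data.frame()`. The name `res` is only illustrative, and the prefix `"mt-"` follows the wording of the question rather than the `'mt'` used above.

```r
# Same data as in the "data" section, rebuilt so this runs on its own
df <- data.frame(
  V1 = c("mt-abc", "mt-dca", "cla", "dla"),
  V2 = c(1L, 1L, 0L, 0L),
  V3 = 0:3,
  V4 = c(2L, 2L, 0L, 0L),
  stringsAsFactors = FALSE
)

# TRUE where a value is greater than 0 (names column dropped)
mat <- df[-1] > 0

# `res` is a hypothetical name; as.data.frame() turns the matrix into a data frame
res <- as.data.frame(rbind(
  above0 = colSums(mat),
  mt     = colSums(startsWith(df$V1, "mt-") & mat)
))

res
```

The row names `above0` and `mt` carry over from the matrix, so the layout matches the result sketched in the question.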

If the actual data has only numbers in the columns and the names stored as row names, we can do:

# No names column to drop, so compare the whole data frame
mat <- df > 0

rbind(above0 = colSums(mat), 
      mt = colSums(startsWith(rownames(df), 'mt') & mat))
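The data section only provides the version with a names column, so here is a small sketch of the row-name variant on a self-contained example. The object names `df_rn`, `mat_rn` and `res_rn` are hypothetical, and the layout (names as row names, all columns numeric) is an assumption about what the actual data looks like.

```r
# Hypothetical layout: names stored as row names, every column numeric
df_rn <- data.frame(
  V2 = c(1L, 1L, 0L, 0L),
  V3 = 0:3,
  V4 = c(2L, 2L, 0L, 0L),
  row.names = c("mt-abc", "mt-dca", "cla", "dla")
)

# Logical matrix: TRUE where a value is greater than 0
mat_rn <- df_rn > 0

# The length-4 logical from startsWith() is recycled down each column of mat_rn
res_rn <- rbind(
  above0 = colSums(mat_rn),
  mt     = colSums(startsWith(rownames(df_rn), "mt") & mat_rn)
)

res_rn
```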

data

df <- structure(list(V1 = c("mt-abc", "mt-dca", "cla", "dla"), V2 = c(1L, 
1L, 0L, 0L), V3 = 0:3, V4 = c(2L, 2L, 0L, 0L)), class = "data.frame", 
row.names = c(NA, -4L))

Upvotes: 1

zoowalk

Reputation: 2134

I don't think this is the most elegant approach in the tidyverse, but just out of curiosity:

library(tidyverse)
my_df <- data.frame(
  stringsAsFactors = FALSE,
  var = c("mt-abc", "mt-dca", "cla", "dla"),
  x = c(1L, 1L, 0L, 0L),
  y = c(0L, 1L, 2L, 3L),
  z = c(2L, 2L, 0L, 0L)
)

# Count values > 0 in each column
df_1 <- my_df %>% 
  summarise(across(.cols = x:z, .fns = ~ sum(.x > 0))) %>% 
  mutate(var = "above0")

# Same count, restricted to rows whose name starts with "mt"
df_2 <- my_df %>% 
  filter(str_detect(var, "^mt")) %>% 
  summarise(across(.cols = x:z, .fns = ~ sum(.x > 0))) %>% 
  mutate(var = "mt")

bind_rows(df_1, df_2)
#>   x y z    var
#> 1 2 3 2 above0
#> 2 2 1 2     mt

Created on 2020-12-04 by the reprex package (v0.3.0)
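A slightly more compact variant of the same idea, offered only as a sketch: it relies on `bind_rows()` taking its labels from named arguments when `.id` is supplied, so the two `mutate()` calls are not needed. The data is rebuilt so the snippet stands alone, `out` is a hypothetical name, and `startsWith()` with `"mt-"` is swapped in for `str_detect()` to avoid loading stringr.

```r
library(dplyr)

# Same example data as above, rebuilt so this runs on its own
my_df <- data.frame(
  var = c("mt-abc", "mt-dca", "cla", "dla"),
  x = c(1L, 1L, 0L, 0L),
  y = c(0L, 1L, 2L, 3L),
  z = c(2L, 2L, 0L, 0L),
  stringsAsFactors = FALSE
)

# `out` is a hypothetical name; the argument names become the values of `var`
out <- bind_rows(
  above0 = my_df %>%
    summarise(across(x:z, ~ sum(.x > 0))),
  mt = my_df %>%
    filter(startsWith(var, "mt-")) %>%
    summarise(across(x:z, ~ sum(.x > 0))),
  .id = "var"
)

out
```

With this form the label column `var` comes first rather than last.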

Upvotes: 0
