fleems
fleems

Reputation: 109

Counting over multiple columns, ignoring NA

I have a data set where I want to count the number of times a value ( larger than 0 ) is present in a range of columns, represented in x_times.

Example:

.    x1   x2   x3   x4  x_times
1    NA    0    0    0   0
2     0   NA    0    0   0
3     0    0   NA    1   1
4     1    0    1   NA   2
5    NA    0    0    0   0
6     0   NA    0    0   0
7     0    0   NA    0   0
8     1    2    4   NA   3
9    NA    0    0    0   0
10    0   NA    1    5   2
11    0    0   NA    0   0
12    0    1    0   NA   1

I fiddled around with ifelse statements, but with 4 columns that's a bit excessive. I could, but there has to be an easier way.

count <- 0
with(df5, ifelse(df$x1 > 0, count+1, ifelse(df$x1 == 0 , count+0 ,
    ifelse(df$x2 > 0, count+1, ifelse(df$x2 == 0, count+0,
     ifelse(df$x3 > 0, count+1, ifelse(df$x3 == 0, count+0, 
       ifelse(df$x4 > 0, count+1, ifelse(df$x4 == 0, count+0, 0 
 )))))))))

Side question, can I do +1 in ifelse statements like this? These lines of ifelse code do not work, it changes the values instead of adding them.

I tried tinkering with this: apply(cbind(df$x1, df$x2, df$x3,df$x4), 1, function(x) ifelse(all(is.na(x)), NA, nrow(x, na.rm=T))) but to no avail.

dplyr::count() gives back errors about integers

rowSums adds up the values, I just want a +1 when x > 0.

I feel like this is a simple question, but I can't figure it out in a concise manner. Answers with a loop would be appreciated for the larger datasets, but it does not have to be a function/for loop per se.

Upvotes: 1

Views: 208

Answers (1)

parkerchad81
parkerchad81

Reputation: 558

As @Sotos says in the comments you can just use rowSums.

 dat <- 
   data.frame(x1 = c(rep(c(NA, 0, 0, 1), 2), NA, 0, 0, 0),
              x2 = c(0, NA, 0, 0, 0, NA, 0, 2, 0, NA, 0, 1),
              x3 = c(0, 0, NA, 1, 0, 0, NA, 4, 0, 1, NA, 0),
              x4 = c(0, 0, 1, NA, 0, 0, 0, NA, 0, 5, 0, NA))

library(tidyverse)
> dat %>% mutate(x_times = rowSums(.>0, na.rm = TRUE))

   x1 x2 x3 x4 x_times
1  NA  0  0  0       0
2   0 NA  0  0       0
3   0  0 NA  1       1
4   1  0  1 NA       2
5  NA  0  0  0       0
6   0 NA  0  0       0
7   0  0 NA  0       0
8   1  2  4 NA       3
9  NA  0  0  0       0
10  0 NA  1  5       2
11  0  0 NA  0       0
12  0  1  0 NA       1

Upvotes: 3

Related Questions