Reputation: 109
I have a data set where I want to count the number of times a value ( larger than 0 ) is present in a range of columns, represented in x_times.
Example:
. x1 x2 x3 x4 x_times
1 NA 0 0 0 0
2 0 NA 0 0 0
3 0 0 NA 1 1
4 1 0 1 NA 2
5 NA 0 0 0 0
6 0 NA 0 0 0
7 0 0 NA 0 0
8 1 2 4 NA 3
9 NA 0 0 0 0
10 0 NA 1 5 2
11 0 0 NA 0 0
12 0 1 0 NA 1
I fiddled around with ifelse statements, but with 4 columns that's a bit excessive. I could, but there has to be an easier way.
count <- 0
with(df5, ifelse(df$x1 > 0, count+1, ifelse(df$x1 == 0 , count+0 ,
ifelse(df$x2 > 0, count+1, ifelse(df$x2 == 0, count+0,
ifelse(df$x3 > 0, count+1, ifelse(df$x3 == 0, count+0,
ifelse(df$x4 > 0, count+1, ifelse(df$x4 == 0, count+0, 0
)))))))))
Side question, can I do +1 in ifelse statements like this? These lines of ifelse code do not work, it changes the values instead of adding them.
I tried tinkering with this: apply(cbind(df$x1, df$x2, df$x3,df$x4), 1, function(x) ifelse(all(is.na(x)), NA, nrow(x, na.rm=T)))
but to no avail.
dplyr::count()
gives back errors about integers
rowSums
adds up the values, I just want a +1 when x > 0
.
I feel like this is a simple question, but I can't figure it out in a concise manner. Answers with a loop would be appreciated for the larger datasets, but it does not have to be a function/for loop per se.
Upvotes: 1
Views: 208
Reputation: 558
As @Sotos says in the comments you can just use rowSums.
dat <-
data.frame(x1 = c(rep(c(NA, 0, 0, 1), 2), NA, 0, 0, 0),
x2 = c(0, NA, 0, 0, 0, NA, 0, 2, 0, NA, 0, 1),
x3 = c(0, 0, NA, 1, 0, 0, NA, 4, 0, 1, NA, 0),
x4 = c(0, 0, 1, NA, 0, 0, 0, NA, 0, 5, 0, NA))
library(tidyverse)
> dat %>% mutate(x_times = rowSums(.>0, na.rm = TRUE))
x1 x2 x3 x4 x_times
1 NA 0 0 0 0
2 0 NA 0 0 0
3 0 0 NA 1 1
4 1 0 1 NA 2
5 NA 0 0 0 0
6 0 NA 0 0 0
7 0 0 NA 0 0
8 1 2 4 NA 3
9 NA 0 0 0 0
10 0 NA 1 5 2
11 0 0 NA 0 0
12 0 1 0 NA 1
Upvotes: 3