Reputation: 11793
I create a dataframe df.
df <- data.frame (id = 1:10,
var1 = 10:19,
var2 = sample(c(1:2,NA), 10, replace=T),
var3 = sample(c(3:5, NA), 10, replace=T))
What I need is a new column var4, which count the number of non-NA values of each row (excluding the id column). So for example, if a row is like var1=19, var2=1, var3=NA, then var4=2. I could not find a good way to do this in dplyr. something like:
df %in% mutate(var4= ... )
I appreciate if anyone can help me with that.
Upvotes: 10
Views: 7761
Reputation: 1484
You can as well use pick
df %>% mutate(var4 = rowSums(!is.na(pick(-id))))
Upvotes: 1
Reputation: 6206
Another solution using only base-r
data.frame(df, var4 = apply(df[,-1], 1, function(x) sum(!is.na(x))))
id var1 var2 var3 var4
1 1 10 1 5 3
2 2 11 2 5 3
3 3 12 2 5 3
4 4 13 NA 3 2
5 5 14 NA 5 2
6 6 15 1 5 3
7 7 16 NA 3 2
8 8 17 NA 4 2
9 9 18 NA 3 2
10 10 19 1 4 3
Upvotes: 0
Reputation: 2500
I know the OP asked for a dplyr
solution, but base R is straightforward here:
df$var4 <- rowSums(!is.na(df[,2:4]))
rowSums
calculates the number of values that are not NA (!is.na
) in columns 2 - 4.
Note, this is summing the logical vector generated by is.na
, which is distinct from:
rowSums(df[,2:4], na.rm = TRUE)
Which drops the NA
s and then sums the remaining values.
Upvotes: 3
Reputation: 214927
Use select
+ is.na
+ rowSums
, select(., -id)
returns the original data frame (.
) with id
excluded, and then count number of non-NA values with rowSums(!is.na(...))
:
df %>% mutate(var4 = rowSums(!is.na(select(., -id))))
# id var1 var2 var3 var4
#1 1 10 NA 4 2
#2 2 11 1 NA 2
#3 3 12 2 5 3
#4 4 13 2 NA 2
#5 5 14 1 NA 2
#6 6 15 1 NA 2
#7 7 16 1 5 3
#8 8 17 NA 4 2
#9 9 18 NA 4 2
#10 10 19 NA NA 1
Upvotes: 15