Reputation: 619
I am working with the following data frame:
Name Color L1 L2 R3
Joe Red 5.4 6.2 7.7
Eric Blue NA 4.1 6.1
Steve Green NA NA 1.2
Mike Red NA NA NA
I would like to count the number of observations across columns L1, L2, and L3, but not count NA values. So I would like the above table to become:
Name Color L1 L2 R3 Count
Joe Red 5.4 6.2 7.7 3
Eric Blue NA 4.1 6.1 2
Steve Green NA NA 1.2 1
Mike Red NA NA NA 0
Thanks in advance!
Upvotes: 3
Views: 1983
Reputation: 887881
We can use rowSums
on a logical matrix i.e. is.na(df1[3:5])
returns a logical matrix where TRUE -> NA and FALSE -> nonNA, by negating (!
), TRUE -> FALSE and viceversa. Then, with rowSums
, the TRUE (TRUE/FALSE -> 1/0) are added together
df1$Count <- rowSums(!is.na(df1[3:5]))
Or using tidyverse
library(dplyr)
df1 %>%
mutate(Count = rowSums(!is.na(select(., 3:5))))
-output
# Name Color L1 L2 R3 Count
#1 Joe Red 5.4 6.2 7.7 3
#2 Eric Blue NA 4.1 6.1 2
#3 Steve Green NA NA 1.2 1
#4 Mike Red NA NA NA 0
Or if we need to use column names in a range, use the select
df1 %>%
mutate(Count = rowSums(!is.na(select(., L1:R3))))
NOTE: rowSums
is vectorized and would be very fast
df1 <- structure(list(Name = c("Joe", "Eric", "Steve", "Mike"), Color = c("Red",
"Blue", "Green", "Red"), L1 = c(5.4, NA, NA, NA), L2 = c(6.2,
4.1, NA, NA), R3 = c(7.7, 6.1, 1.2, NA)), class = "data.frame",
row.names = c(NA,
-4L))
Upvotes: 5
Reputation: 39613
Try this using dplyr
:
library(dplyr)
#Code
newdf <- df %>% rowwise() %>% mutate(Count=sum(!is.na(c_across(L1:R3))))
Output:
# A tibble: 4 x 6
# Rowwise:
Name Color L1 L2 R3 Count
<chr> <chr> <dbl> <dbl> <dbl> <int>
1 Joe Red 5.4 6.2 7.7 3
2 Eric Blue NA 4.1 6.1 2
3 Steve Green NA NA 1.2 1
4 Mike Red NA NA NA 0
Some data used:
#Data
df <- structure(list(Name = c("Joe", "Eric", "Steve", "Mike"), Color = c("Red",
"Blue", "Green", "Red"), L1 = c(5.4, NA, NA, NA), L2 = c(6.2,
4.1, NA, NA), R3 = c(7.7, 6.1, 1.2, NA)), class = "data.frame", row.names = c(NA,
-4L))
Upvotes: 1
Reputation: 2301
Simple base R solution:
df1$Count <- apply(df1[3:5], 1, function(x) sum(!is.na(x)))
Upvotes: 1