887
887

Reputation: 619

Count non-NA observations by row in selected columns

I am working with the following data frame:

Name  Color   L1    L2    R3 
Joe    Red    5.4   6.2   7.7
Eric   Blue   NA    4.1   6.1
Steve  Green  NA    NA    1.2
Mike   Red    NA    NA    NA

I would like to count the number of observations across columns L1, L2, and L3, but not count NA values. So I would like the above table to become:

Name  Color   L1    L2    R3    Count 
Joe    Red    5.4   6.2   7.7    3
Eric   Blue   NA    4.1   6.1    2
Steve  Green  NA    NA    1.2    1
Mike   Red    NA    NA    NA     0

Thanks in advance!

Upvotes: 3

Views: 1983

Answers (3)

akrun
akrun

Reputation: 887881

We can use rowSums on a logical matrix i.e. is.na(df1[3:5]) returns a logical matrix where TRUE -> NA and FALSE -> nonNA, by negating (!), TRUE -> FALSE and viceversa. Then, with rowSums, the TRUE (TRUE/FALSE -> 1/0) are added together

df1$Count <- rowSums(!is.na(df1[3:5]))

Or using tidyverse

library(dplyr)
df1 %>%
    mutate(Count = rowSums(!is.na(select(., 3:5))))

-output

#    Name Color  L1  L2  R3 Count
#1   Joe   Red 5.4 6.2 7.7     3
#2  Eric  Blue  NA 4.1 6.1     2
#3 Steve Green  NA  NA 1.2     1
#4  Mike   Red  NA  NA  NA     0

Or if we need to use column names in a range, use the select

df1 %>%
    mutate(Count = rowSums(!is.na(select(., L1:R3))))
 

NOTE: rowSums is vectorized and would be very fast

data

df1 <- structure(list(Name = c("Joe", "Eric", "Steve", "Mike"), Color = c("Red", 
"Blue", "Green", "Red"), L1 = c(5.4, NA, NA, NA), L2 = c(6.2, 
4.1, NA, NA), R3 = c(7.7, 6.1, 1.2, NA)), class = "data.frame",
row.names = c(NA, 
-4L))

Upvotes: 5

Duck
Duck

Reputation: 39613

Try this using dplyr:

library(dplyr)
#Code
newdf <- df %>% rowwise() %>% mutate(Count=sum(!is.na(c_across(L1:R3))))

Output:

# A tibble: 4 x 6
# Rowwise: 
  Name  Color    L1    L2    R3 Count
  <chr> <chr> <dbl> <dbl> <dbl> <int>
1 Joe   Red     5.4   6.2   7.7     3
2 Eric  Blue   NA     4.1   6.1     2
3 Steve Green  NA    NA     1.2     1
4 Mike  Red    NA    NA    NA       0

Some data used:

#Data
df <- structure(list(Name = c("Joe", "Eric", "Steve", "Mike"), Color = c("Red", 
"Blue", "Green", "Red"), L1 = c(5.4, NA, NA, NA), L2 = c(6.2, 
4.1, NA, NA), R3 = c(7.7, 6.1, 1.2, NA)), class = "data.frame", row.names = c(NA, 
-4L))

Upvotes: 1

SteveM
SteveM

Reputation: 2301

Simple base R solution:

df1$Count <- apply(df1[3:5], 1, function(x) sum(!is.na(x)))

Upvotes: 1

Related Questions