Reputation: 147
I have the following data structure:
id val1 val2 val3
1 1 5 2
1 2 7 5
2 2 2 NA
2 NA 4 NA
3 4 8 4
3 3 NA 4
I want to count the number of oberservations for each variable, excluding the NA values. Result should be like:
id val1 val2 val3
1 2 2 2
2 1 2 NA
3 2 1 2
I tried the following lines but it would not work:
df %>%
group_by(id) %>%
summarise_all(funs(n(., na.rm = TRUE)))
Hoping for your suggestions and advice.
Upvotes: 3
Views: 250
Reputation: 101337
Another base R option using ´aggregate`
> aggregate(!is.na(df[-1]), df[1], sum)
id val1 val2 val3
1 1 2 2 2
2 2 1 2 0
3 3 2 1 2
Data
> dput(df)
structure(list(id = c(1L, 1L, 2L, 2L, 3L, 3L), val1 = c(1L, 2L,
2L, NA, 4L, 3L), val2 = c(5L, 7L, 2L, 4L, 8L, NA), val3 = c(2L,
5L, NA, NA, 4L, 4L)), class = "data.frame", row.names = c(NA,
-6L))
Upvotes: 1
Reputation: 887088
We can use data.table
approach
library(data.table)
setDT(df)[, lapply(.SD, function(x) sum(!is.na(x))), id]
The summarise_all
should be
library(dplyr)
df %>%
group_by(id) %>%
summarise_all(~ sum(!is.na(.), na.rm = TRUE))
Or using collapse
library(collapse)
fsum(!is.na(slt(df,-id)), g = GRP(df, ~ id))
# val1 val2 val3
#1 2 2 2
#2 1 2 0
#3 2 1 2
Upvotes: 1
Reputation: 388982
You can use sum(!is.na(.))
to count non-NA values.
library(dplyr)
df %>% group_by(id) %>% summarise(across(.fns = ~sum(!is.na(.))))
# id val1 val2 val3
# <int> <int> <int> <int>
#1 1 2 2 2
#2 2 1 2 0
#3 3 2 1 2
In base R :
aggregate(.~id, df, function(x) sum(!is.na(x)), na.action = 'na.keep')
Upvotes: 4