Reputation: 2735
This should be easy: counting the number of non-missing observations in each column of an R data frame.
I have a data frame whose columns contain missing (NA) values, and I want to know which columns have too many missing observations. In Python (pandas) this is straightforward; just run
df.count()
on a data frame, and presto: it shows the number of non-missing observations for every column.
In R's tidyverse, however, this looks very convoluted. Most suggestions I found deal with counting by groups, but in this case I do not want grouped counts.
I tried:
mtcars %>%
select(everything()) %>%
summarise_all(funs(sum(!is.na(.))))
which unfortunately throws an error, because 'funs()' was deprecated in dplyr 0.8.0.
Upvotes: 1
Views: 2248
Reputation: 887118
We can use complete.cases inside across:
library(dplyr)
mtcars %>%
summarise(across(everything(), ~ sum(complete.cases(.))))
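As a quick check of why this works: when complete.cases is given a single vector (which is what across passes for each column), it should return the same logical vector as !is.na. A minimal base R sketch, using a made-up vector x that is not from the question:

```r
# Hypothetical example vector with two missing values
x <- c(10, NA, 30, NA, 50)

# For a single vector, complete.cases() flags the non-NA elements
same <- identical(complete.cases(x), !is.na(x))

# Summing the logical vector counts the non-missing observations
n_obs <- sum(complete.cases(x))

print(same)
print(n_obs)
```

Note that mtcars itself contains no NA values, so on that data set every column simply reports the number of rows.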
Upvotes: 0
Reputation: 388982
funs is deprecated and summarise_all is superseded. You can do this with across:
library(dplyr)
# Pass .cols explicitly: calling across() without it is deprecated since dplyr 1.1.0
mtcars %>% summarise(across(everything(), ~ sum(!is.na(.x))))
Or in base R:
colSums(!is.na(mtcars))
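To get from these counts to the original goal (finding columns with too many missing observations), one possible sketch in base R follows. The data frame df and the 40% threshold are made up for illustration; they are not from the question.

```r
# Hypothetical data frame with some missing values
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, "x", "y"),
  c = c(TRUE, FALSE, TRUE, NA)
)

# Non-missing observations per column (similar in spirit to pandas' df.count())
non_missing <- colSums(!is.na(df))

# Share of missing values per column
prop_missing <- colMeans(is.na(df))

# Columns where more than 40% of the values are missing (threshold is an assumption)
too_sparse <- names(prop_missing)[prop_missing > 0.4]

print(non_missing)
print(too_sparse)
```

colMeans(is.na(df)) gives the missing share directly, which makes a threshold easier to read than raw counts when columns are compared across data frames of different lengths.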
Upvotes: 3