Martien Lubberink

Reputation: 2735

How to count the number of non-NA observations per column of a data frame using dplyr in R (like df.count() in Python pandas)

This should be easy: counting the number of non-missing observations in an R data frame.

I have a data frame in which some columns have missing (NA) values, and I want to know which columns have too many missing observations. In Python this is straightforward: just run

df.count()

on a data frame, and presto: it shows the non-missing observations for every column.

However, with R's tidyverse this looks very convoluted. There are many suggestions about counting by group, but in this case I do not want grouped counts.

I tried:

mtcars %>%
  select(everything()) %>%  
  summarise_all(funs(sum(!is.na(.))))

which unfortunately throws an error, because 'funs()' was deprecated in dplyr 0.8.0.

Upvotes: 1

Views: 2248

Answers (2)

akrun

Reputation: 887118

We can use complete.cases(), which for a single column is equivalent to !is.na():

library(dplyr)
mtcars %>%
    summarise(across(everything(), ~ sum(complete.cases(.))))

Upvotes: 0

Ronak Shah

Reputation: 388982

funs() is deprecated and summarise_all() has been superseded. You can do this with across():

library(dplyr)
# Pass everything() explicitly; omitting .cols is deprecated as of dplyr 1.1.0
mtcars %>% summarise(across(everything(), ~ sum(!is.na(.x))))

Or in base R -

colSums(!is.na(mtcars))
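
Since mtcars contains no NAs, every count above comes out as 32. As a rough sketch of the original goal (spotting columns with too many missing values), here is the same idea on a small made-up data frame. The data frame `df`, its columns, and the 25% cutoff are assumptions chosen purely for illustration:

```r
library(dplyr)

# Hypothetical toy data with a few NAs in each column
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, "x", "y"),
  c = c(TRUE, FALSE, TRUE, NA)
)

# Non-NA count per column with dplyr (one-row data frame)
non_na_dplyr <- df %>%
  summarise(across(everything(), ~ sum(!is.na(.x))))

# Same counts in base R (named numeric vector)
non_na_base <- colSums(!is.na(df))

# Columns where more than 25% of the rows are missing;
# the cutoff is an arbitrary choice for this example
too_sparse <- names(df)[colMeans(is.na(df)) > 0.25]

print(non_na_dplyr)
print(non_na_base)
print(too_sparse)
```

colMeans(is.na(df)) gives the share of missing values per column, which is usually easier to compare against a threshold than raw counts.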

Upvotes: 3
