Rob Creel

Reputation: 343

In R, how do I drop a column whose values are all FALSE?

I have a dataframe, df. Some of its columns include logicals. I would like to drop the ones that are all FALSE.

library(tibble)
df <- tibble(A = rep(TRUE, 5),
             B = rep(FALSE, 5),
             C = c(TRUE, FALSE, TRUE, TRUE, FALSE))

df

# A tibble: 5 x 3
  A     B     C    
  <lgl> <lgl> <lgl>
1 TRUE  FALSE TRUE 
2 TRUE  FALSE FALSE
3 TRUE  FALSE TRUE 
4 TRUE  FALSE TRUE 
5 TRUE  FALSE FALSE

The desired output is:

  A     C    
  <lgl> <lgl>
1 TRUE  TRUE 
2 TRUE  FALSE
3 TRUE  TRUE 
4 TRUE  TRUE 
5 TRUE  FALSE

I have tried selecting constant columns using the janitor package, but that will remove columns that are all TRUE also.

How may I do this? (I prefer a tidyverse solution, but barring that base R or some other available package is acceptable.)

Edit: My minimal working example above was too minimal. I should have mentioned that there are non-logical columns I want to keep too. The solution for me, provided by akrun in chat, was:

library(dplyr)
library(purrr)
df %>% select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))

Upvotes: 4

Views: 2921

Answers (3)

Konrad

Reputation: 18595

This is the cleanest solution:

select_if(.tbl = df, .predicate = any)

Explanation:

  • .predicate - applied to each column; select_if keeps the columns for which it returns TRUE
  • any - returns TRUE if at least one value in the column is TRUE. It also works on numeric input, which is coerced to logical (with a warning for doubles), so any(0, -1) is TRUE. The one edge case is any(0, 0), which returns FALSE.
    • If a column may contain only 0s, you may want to implement an additional check, because it is treated like an all-FALSE column. The same applies to empty input: any(NULL, NULL) also returns FALSE.
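The behaviour of any described above can be checked directly in base R. This is a small sketch; the variable names num_mixed and num_zeros are illustrative only, and suppressWarnings is used because any warns when it coerces doubles to logical.

```r
# any() on logical input: TRUE as soon as one element is TRUE
any(c(FALSE, TRUE, FALSE))
any(c(FALSE, FALSE))

# any() on double input coerces to logical (non-zero becomes TRUE) and warns,
# so the warning is suppressed here to keep the output clean
num_mixed <- suppressWarnings(any(c(0, -1)))  # at least one non-zero value
num_zeros <- suppressWarnings(any(c(0, 0)))   # only zeros

# any() with no elements at all is FALSE
any(NULL)
```

Here num_mixed is TRUE and num_zeros is FALSE, which is why an all-zero numeric column is dropped by a bare any predicate.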

To avoid those edge cases, a better option is to apply any only to logical columns and keep every other column as is:

select_if(.tbl = df, .predicate = ~ !is.logical(.x) || any(.x))

Upvotes: 2

akrun

Reputation: 887501

Using base R with Filter and any

Filter(any, df)

Or in dplyr

library(dplyr)
df %>%
    select(where(any))

-output

# A tibble: 5 x 2
#  A     C    
#  <lgl> <lgl>
#1 TRUE  TRUE 
#2 TRUE  FALSE
#3 TRUE  TRUE 
#4 TRUE  TRUE 
#5 TRUE  FALSE

Based on the OP's comments, the goal is to keep the columns that are not logical, along with the logical columns that contain at least one TRUE:

library(purrr)
df %>% 
  select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
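To see what this does on mixed data, here is a minimal sketch. The tibble df2 and its id and name columns are made up for illustration; they are not from the question. Because select() returns columns in the order the selections are written, the kept logical columns come first and the non-logical ones follow. The second call is an alternative I am assuming may be preferable when the original column order should be preserved.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical data: the question's logical columns plus two non-logical ones
df2 <- tibble(id   = 1:5,
              A    = rep(TRUE, 5),
              B    = rep(FALSE, 5),
              C    = c(TRUE, FALSE, TRUE, TRUE, FALSE),
              name = letters[1:5])

# Two selections: logical columns with any TRUE, then all non-logical columns
result <- df2 %>%
  select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
names(result)

# Single predicate: keeps the same columns but in their original order
result_ordered <- df2 %>%
  select(where(~ !is.logical(.x) || any(.x)))
names(result_ordered)
```

Both versions drop only B; they differ in where id ends up.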

Upvotes: 9

ThomasIsCoding

Reputation: 102241

A base R option

df[colSums(df) > 0]

or

df[unique(which(as.matrix(df), arr.ind = TRUE)[, "col"])]
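Both one-liners assume every column is logical. With a character column present, colSums stops with an error, and numeric columns would be judged by their sum rather than kept. For the mixed case from the question's edit, a base R sketch could look like the following. The data frame df3 and its column names are hypothetical, and na.rm = TRUE is my assumption: it means a column holding only FALSE and NA is also treated as all FALSE.

```r
# Hypothetical mixed data frame: integer, logical and character columns
df3 <- data.frame(id        = 1:3,
                  flag_none = c(FALSE, FALSE, FALSE),
                  flag_some = c(TRUE, FALSE, FALSE),
                  label     = c("x", "y", "z"))

# TRUE for columns that are logical and contain no TRUE at all
# (assumption: NA values are ignored via na.rm = TRUE)
all_false <- vapply(df3,
                    function(col) is.logical(col) && !any(col, na.rm = TRUE),
                    logical(1))

# Keep every column that is not an all-FALSE logical column
kept <- df3[!all_false]
names(kept)
```

Only flag_none is dropped; id, flag_some and label stay in their original order.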

Upvotes: 1
