Reputation: 343
I have a data frame, df. Some of its columns are logical. I would like to drop the ones that are all FALSE.
library(tibble)
df <- tibble(A = rep(TRUE, 5),
B = rep(FALSE, 5),
C = c(TRUE, FALSE, TRUE, TRUE, FALSE))
df
# A tibble: 5 x 3
A B C
<lgl> <lgl> <lgl>
1 TRUE FALSE TRUE
2 TRUE FALSE FALSE
3 TRUE FALSE TRUE
4 TRUE FALSE TRUE
5 TRUE FALSE FALSE
The desired output is:
A C
<lgl> <lgl>
1 TRUE TRUE
2 TRUE FALSE
3 TRUE TRUE
4 TRUE TRUE
5 TRUE FALSE
I have tried selecting constant columns using the janitor package, but that also removes columns that are all TRUE.
How may I do this? (I prefer a tidyverse solution, but barring that base R or some other available package is acceptable.)
Edit: My minimal working example above was too minimal. I should have mentioned that there are non-logical columns I want to keep too. The solution for me, provided by akrun in chat, was:
library(dplyr)
library(purrr)
df %>% select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
Upvotes: 4
Views: 2921
Reputation: 18595
This is the cleanest solution:
select_if(.tbl = df, .predicate = any)
Explanation:
.predicate - a function applied to each column; columns for which it returns TRUE are kept.
any - returns TRUE if at least one value is TRUE. It also works on numeric input, which is coerced to logical (with a warning), so any(0, -1) is TRUE because -1 is non-zero. There is one edge case: any(0, 0) returns FALSE, so if you have numeric columns that are all 0s you may want to implement an additional check. Again, that gives the same result as any(NULL, NULL).
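The coercion behaviour described above can be checked directly in base R. This is a minimal sketch; the variable names are illustrative, and suppressWarnings() is used only to silence the coercion warning that any() emits for numeric arguments.

```r
# any() coerces numeric arguments to logical: 0 -> FALSE, non-zero -> TRUE
any_mixed <- suppressWarnings(any(0, -1))   # one non-zero value present
any_zeros <- suppressWarnings(any(0, 0))    # only zeros
any_null  <- any(NULL, NULL)                # nothing to test at all

# Missing values: with no TRUE present, any() returns NA unless na.rm = TRUE
any_na       <- any(c(FALSE, NA))
any_na_clean <- any(c(FALSE, NA), na.rm = TRUE)

print(c(mixed = any_mixed, zeros = any_zeros, null = any_null,
        na = any_na, na_clean = any_na_clean))
```

The NA case matters for column selection, because a predicate that returns NA is neither TRUE nor FALSE.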
If you want to avoid those edge cases, a better option is to test only the logical columns and keep everything else:
select_if(.tbl = df, .predicate = ~ !is.logical(.x) || any(.x, na.rm = TRUE))
Upvotes: 2
Reputation: 887501
Using base R with Filter and any:
Filter(any, df)
Or in dplyr
library(dplyr)
df %>%
select(where(any))
Output:
# A tibble: 5 x 2
# A C
# <lgl> <lgl>
#1 TRUE TRUE
#2 TRUE FALSE
#3 TRUE TRUE
#4 TRUE TRUE
#5 TRUE FALSE
Based on the OP's comments, they wanted to keep columns that are not logical, along with logical columns that contain at least one TRUE:
library(purrr)
df %>%
select(where(~ is.logical(.) && any(.)), where(negate(is.logical)))
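For comparison, here is a base R sketch of the same idea, assuming a data frame with a mix of logical and non-logical columns. The data frame df2 and the helper keep_col are hypothetical names introduced for illustration, and na.rm = TRUE is an added assumption so that a logical column containing only FALSE and NA is dropped rather than yielding NA.

```r
# Hypothetical example data: three logical columns plus two non-logical ones
df2 <- data.frame(
  A    = rep(TRUE, 5),
  B    = rep(FALSE, 5),
  C    = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  id   = 1:5,
  name = letters[1:5]
)

# Keep a column if it is not logical, or if it is logical with at least one TRUE
keep_col <- function(x) !is.logical(x) || any(x, na.rm = TRUE)

# Filter() applies the predicate to each column and keeps those returning TRUE
result <- Filter(keep_col, df2)
print(names(result))
```

Unlike the select() version, Filter() keeps the columns in their original order rather than moving the non-logical ones to the end.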
Upvotes: 9
Reputation: 102241
A base R option
df[colSums(df)>0]
or
df[unique(which(as.matrix(df),arr.ind = TRUE)[,"col"])]
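Note that colSums() propagates NA, so for a logical column with missing values the comparison > 0 in the first form yields NA instead of TRUE or FALSE. A hedged sketch follows, assuming NA should simply be ignored when deciding whether a column has any TRUE (df3 is a hypothetical example, not the OP's data):

```r
# Hypothetical logical data frame with missing values
df3 <- data.frame(
  A = c(TRUE, NA, TRUE),
  B = c(FALSE, FALSE, FALSE),
  C = c(NA, FALSE, TRUE)
)

# Without na.rm, columns containing NA sum to NA
sums_raw <- colSums(df3)

# With na.rm = TRUE, each sum counts the TRUE values and ignores NA
keep <- colSums(df3, na.rm = TRUE) > 0
result3 <- df3[keep]
print(names(result3))
```

This still assumes every column is logical or numeric; colSums() raises an error on a data frame that contains character columns.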
Upvotes: 1