Reputation: 179
I want to remove columns that contain only a single unique value.
First, I tried it on a single column, and it works:
data %>%
select_if(length(unique(data$policy_id)) > 1)
Then I tried it on multiple columns, as below:
data %>%
select_if(length(unique(data[, c("policy_date", "policy_id")])) > 1)
But it does not work. I think it is a conceptual mistake due to my lack of experience.
Thanks in advance.
Upvotes: 7
Views: 1444
Reputation: 886938
An option with base R:
# keep only the columns with more than one distinct value
df[sapply(df, function(x) length(unique(x))) > 1]
# example data
df <- data.frame(A = LETTERS[1:5], B = 1:5, C = 2)
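As for why the multi-column attempt in the question always passes: unique() on a data frame returns a data frame of unique rows, and length() of a data frame is its number of columns, so the test compares the number of selected columns (2) with 1. A minimal sketch, assuming a small made-up `data` frame in place of the asker's real one:

```r
# Hypothetical data standing in for the asker's `data`.
data <- data.frame(
  policy_date = c("2020-01-01", "2020-01-01", "2020-01-01"),
  policy_id   = c(1, 2, 3),
  stringsAsFactors = FALSE
)
cols <- c("policy_date", "policy_id")

# unique() on a data frame drops duplicate rows and returns a data frame;
# length() of a data frame is its number of columns, not a distinct count.
n_cols_checked <- length(unique(data[, cols]))

# The distinct count has to be computed once per column instead.
per_column <- sapply(data[cols], function(x) length(unique(x)))
keep <- names(per_column)[per_column > 1]
result <- data[keep]
```

Here `n_cols_checked` is 2 regardless of the values in the columns, while `per_column` holds one count for each column, which is what the predicate needs.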
Upvotes: 0
Reputation: 3269
Another option would be to use purrr:
library(dplyr)  # provides %>% and n_distinct()
df %>% purrr::keep(~all(n_distinct(.) > 1))
df %>% purrr::keep(~all(length(unique(.)) > 1))
df %>% purrr::discard(~!all(n_distinct(.) > 1))
df %>% purrr::discard(~!all(length(unique(.)) > 1))
Mixing table with apply generates the same output:
df[, apply(df, 2, function(i) length(table(i)) > 1)]
# example data
df <- data.frame(A = LETTERS[1:5], B = 1:5, C = 2)
Upvotes: 0
Reputation: 101034
Some base R options:
lengths + unique + sapply
subset(df, select = lengths(sapply(df, unique)) > 1)
Filter + length + unique
Filter(function(x) length(unique(x)) > 1, df)
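One caveat with the sapply variant, shown here as a sketch on a made-up frame (`df2` is not from the question): when every column has the same number of distinct values, sapply() simplifies its result to a matrix, so lengths() no longer sees one element per column. Using lapply() in its place is one way to avoid that.

```r
# Hypothetical frame where every column has exactly two distinct values.
df2 <- data.frame(A = c("x", "y", "x", "y"), B = c(1, 2, 1, 2),
                  stringsAsFactors = FALSE)

# sapply() simplifies equal-length results to a matrix here, so the
# per-column structure that lengths() relies on is lost.
simplified <- sapply(df2, unique)

# lapply() always returns a list with one element per column.
counts <- lengths(lapply(df2, unique))
kept <- df2[counts > 1]
```

The Filter version above is not affected, because it evaluates the predicate on each column separately.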
Upvotes: 3
Reputation: 11584
Does this work?
> library(dplyr)
> df <- data.frame(col1 = 1:10,
+ col2 = rep(10,10),
+ col3 = round(rnorm(10,1)))
> df
col1 col2 col3
1 1 10 1
2 2 10 0
3 3 10 1
4 4 10 1
5 5 10 1
6 6 10 0
7 7 10 2
8 8 10 1
9 9 10 1
10 10 10 1
> df %>% select_if(~length(unique(.)) > 1)
col1 col3
1 1 1
2 2 0
3 3 1
4 4 1
5 5 1
6 6 0
7 7 2
8 8 1
9 9 1
10 10 1
Upvotes: 1
Reputation: 173793
You can use select(where()).
Suppose I have a data frame like this:
df <- data.frame(A = LETTERS[1:5], B = 1:5, C = 2)
df
#> A B C
#> 1 A 1 2
#> 2 B 2 2
#> 3 C 3 2
#> 4 D 4 2
#> 5 E 5 2
Then I can do:
df %>% select(where(~ n_distinct(.) > 1))
#> A B
#> 1 A 1
#> 2 B 2
#> 3 C 3
#> 4 D 4
#> 5 E 5
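If the data may contain missing values, it is worth deciding whether NA should count as a distinct value. A small sketch, assuming dplyr 1.0.0 or later and a made-up frame `df_na` (not part of the original example), using the `na.rm` argument of n_distinct():

```r
library(dplyr)

# Hypothetical data: column C is constant apart from one missing value.
df_na <- data.frame(A = LETTERS[1:5], B = 1:5, C = c(2, NA, 2, 2, 2))

# Default: NA counts as a distinct value, so C has two and is kept.
with_na <- df_na %>% select(where(~ n_distinct(.) > 1))

# na.rm = TRUE ignores NA when counting, so C has one value and is dropped.
without_na <- df_na %>% select(where(~ n_distinct(., na.rm = TRUE) > 1))
```

Which behaviour is right depends on whether a column that is constant except for missing values should be treated as informative.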
Upvotes: 4