Reputation: 1231
This is a follow-up to an earlier question. I would like to check whether any column in a data frame contains the same value (numeric or string) in every row. For example,
sample <- data.frame(col1=c(1, 1, 1), col2=c("a", "a", "a"), col3=c(12, 15, 22))
The goal is to inspect each column of the data frame and find which columns do not have identical entries in all rows. How can I do this, given that the columns hold both numbers and strings?
My expected output is a vector containing the numbers of the columns that have non-identical entries.
Upvotes: 3
Views: 6890
Reputation: 886938
We can use Filter, which keeps only the columns for which the function returns TRUE:
names(Filter(function(x) length(unique(x)) != 1, sample))
#[1] "col3"
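The question asks for column numbers rather than names, so here is a minimal sketch of the same test returning positions. It assumes the `sample` data frame from the question; the variable names `varies` and `idx` are only illustrative.

```r
sample <- data.frame(col1 = c(1, 1, 1), col2 = c("a", "a", "a"), col3 = c(12, 15, 22))

# TRUE for every column whose number of distinct values is not 1
varies <- vapply(sample, function(x) length(unique(x)) != 1, logical(1))

# which() turns the logical vector into column positions (named by column)
idx <- which(varies)
unname(idx)
```

`vapply` is used here instead of `sapply` only because it guarantees a logical vector even for a data frame with no columns; either would work on the example.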
Upvotes: 1
Reputation: 388817
We can use apply column-wise (margin = 2): count the unique values in each column and select the columns whose count is not equal to 1.
which(apply(sample, 2, function(x) length(unique(x))) != 1)
#col3
# 3
The same can be done with an sapply or lapply call:
which(sapply(sample, function(x) length(unique(x))) != 1)
#col3
# 3
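One difference between the two calls may be worth keeping in mind: `apply` first converts the data frame to a matrix, which for mixed numeric and string columns is a character matrix, while `sapply` visits each column with its original type. A small sketch, again assuming the `sample` data frame from the question (`m` and `counts` are illustrative names):

```r
sample <- data.frame(col1 = c(1, 1, 1), col2 = c("a", "a", "a"), col3 = c(12, 15, 22))

# apply() works on as.matrix(sample); mixed columns give a character matrix
m <- as.matrix(sample)
typeof(m)

# sapply() keeps each column as is and returns the distinct count per column
counts <- sapply(sample, function(x) length(unique(x)))
counts

# positions of the columns that are not constant
which(counts != 1)
```

For this example both approaches give the same answer; the coercion only matters if the comparison should respect the original column types.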
A dplyr version could be
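library(dplyr)
# count distinct values per column, then keep the columns whose count is not 1
# (funs() is deprecated since dplyr 0.8.0, so n_distinct is passed directly)
sample %>%
  summarise_all(n_distinct) %>%
  select_if(. != 1)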
# col3
#1 3
Upvotes: 7