LaTeXFan

Reputation: 1231

Check if a column in a dataframe is of the same value

This is a follow-up question to this one. I would like to check whether any column in a data frame contains the same value (numeric or string) in every row. For example,

sample <- data.frame(col1=c(1, 1, 1), col2=c("a", "a", "a"), col3=c(12, 15, 22))

The goal is to inspect each column of the data frame and find the columns that do not have identical entries in all rows. How can I do this? Note that the columns hold both numbers and strings.

My expected output is a vector containing the numbers of the columns that have non-identical entries.

Upvotes: 3

Views: 6890

Answers (2)

akrun

Reputation: 886938

We can use Filter to keep only the columns whose number of unique values is not 1, and then take their names:

names(Filter(function(x) length(unique(x)) != 1, sample))
#[1] "col3"
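The question asks for column numbers rather than names. A minimal sketch of one way to get them with the same unique-count test is below; the variable name `varies` is only illustrative, and `vapply` is used here instead of `Filter` so that the result is a logical vector that `which` can index.

```r
sample <- data.frame(col1 = c(1, 1, 1), col2 = c("a", "a", "a"), col3 = c(12, 15, 22))

# TRUE for every column whose rows are not all identical
varies <- vapply(sample, function(x) length(unique(x)) != 1, logical(1))

# Named integer vector of column positions
which(varies)

# Plain positions without the names
unname(which(varies))
```

For the sample data this should pick out the third column only.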

Upvotes: 1

Ronak Shah

Reputation: 388817

We can use apply column-wise (MARGIN = 2) to count the unique values in each column, and then select the columns whose number of unique values is not equal to 1.

which(apply(sample, 2, function(x) length(unique(x))) != 1)

#col3 
#   3 

The same can be done with an sapply (or lapply) call, which iterates over the columns directly:

which(sapply(sample, function(x) length(unique(x))) != 1)
#col3 
#   3 
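If this check is needed in several places, the sapply approach can be wrapped in a small function. The sketch below is only an illustration: the function name `non_constant_cols` and its `na.rm` argument are hypothetical, not part of base R. It also shows that `unique` treats `NA` as a value of its own, so a column such as `c(1, 1, NA)` counts as non-constant unless the missing values are dropped first.

```r
# Hypothetical helper: positions of the columns that are not constant
non_constant_cols <- function(df, na.rm = FALSE) {
  n_unique <- vapply(df, function(x) {
    if (na.rm) x <- x[!is.na(x)]  # optionally ignore missing values
    length(unique(x))
  }, integer(1))
  unname(which(n_unique != 1))
}

df <- data.frame(x = c(1, 1, NA), y = c("a", "a", "a"), z = c(12, 15, 22))

non_constant_cols(df)                # x and z: NA counts as a distinct value
non_constant_cols(df, na.rm = TRUE)  # only z
```

With `na.rm = TRUE`, a column that is entirely `NA` has zero unique values and is therefore still reported, because the test is "not equal to 1".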

A dplyr version could be

library(dplyr)
sample %>%
  summarise_all(n_distinct) %>%
  select_if(. != 1)

#  col3
#1    3
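The scoped verbs `summarise_all` and `select_if` have been superseded in newer dplyr releases. Assuming dplyr 1.0.0 or later, a rough equivalent that returns the column numbers asked for in the question might look like the sketch below; the variable name `varying` is only illustrative.

```r
library(dplyr)

sample <- data.frame(col1 = c(1, 1, 1), col2 = c("a", "a", "a"), col3 = c(12, 15, 22))

# Names of the columns whose number of distinct values is not 1
varying <- sample %>%
  select(where(~ n_distinct(.x) != 1)) %>%
  names()

# Convert the names back to column positions
match(varying, names(sample))
```

Here the test is applied inside `select`, so no intermediate summary table is needed.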

Upvotes: 7
