user2795569

Reputation: 343

Remove invariant column in R

I want to remove a column if it is invariant across rows 2:nrow(df), i.e. every row except the first.

I just can't get this to work. I'm still new to R and to programming in general.

red    <- c(1, 2, 3)
blue   <- c(4, 5, 4)
green  <- c(4, 7, 2)
colors <- data.frame(red, blue, green)
colors <- t(colors)
colors

      [,1] [,2] [,3]
red      1    2    3
blue     4    5    4
green    4    7    2

How can I remove column 1 programmatically, given that blue and green have the same value in it? It doesn't specifically need to use variance; any method that removes columns whose values are all the same will do the job.

Thanks so much!

Upvotes: 3

Views: 655

Answers (1)

Ricardo Saporta

Reputation: 55400

To remove a column, just reassign the object without that column:

colors <- colors[, -1]
colors

#       [,1] [,2]
# red      2    3
# blue     5    4
# green    7    2

If you have a list of columns to drop (technically a vector, not an R list), use:

toDrop <- c( <whichever columns to drop> )
colors <- colors[, -toDrop]

Alternatively, if you know which you are keeping:

toKeep <- c( <whichever columns to keep> )
colors <- colors[, toKeep]
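
As a concrete illustration of the two patterns above, here is a minimal sketch using the matrix from the question. The index values are picked purely for illustration, and the names dropped and kept are not from the original post. It also passes drop = FALSE, which keeps the result a matrix even when only one column is left:

```r
# Rebuild the question's matrix (rows are colors, columns are unnamed)
colors <- rbind(red = c(1, 2, 3), blue = c(4, 5, 4), green = c(4, 7, 2))

# Negative indices remove columns; these indices are an illustrative choice
toDrop  <- c(1, 3)
dropped <- colors[, -toDrop, drop = FALSE]   # still a 3 x 1 matrix

# Positive indices select the columns to keep
toKeep <- c(2, 3)
kept   <- colors[, toKeep, drop = FALSE]     # 3 x 2 matrix
```

Without drop = FALSE, the first subset would collapse to a plain named vector, which can surprise later code that expects a matrix.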

As for determining whether a column is invariant, use duplicated, applied to each column separately (via apply) rather than to the whole object at once:

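# TRUE for each column whose values in rows 2..n are all identical
toDrop <- apply(colors[2:nrow(colors), ], 2, function(x) all(duplicated(x)[-1]))

# Optionally convert the logical vector to column indices:
toDrop <- which(toDrop)
# Guard: negating an empty index vector would select zero columns
if (length(toDrop))
   colors <- colors[, -toDrop]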
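
For reference, here is a self-contained sketch of the whole procedure, run on the matrix from the question. It swaps in length(unique(x)) == 1 as the invariance test instead of duplicated, and uses logical indexing so no empty-index guard is needed. The names sub, isInvariant and result are illustrative, and it assumes the matrix has at least two rows:

```r
# Rebuild the question's matrix (rows are colors, columns are unnamed)
colors <- rbind(red = c(1, 2, 3), blue = c(4, 5, 4), green = c(4, 7, 2))

# Look at rows 2..n only; drop = FALSE keeps a matrix even if one row remains
sub <- colors[2:nrow(colors), , drop = FALSE]

# A column is invariant when it holds exactly one distinct value
isInvariant <- apply(sub, 2, function(x) length(unique(x)) == 1)

# Keep the columns that are not invariant; logical indexing is safe
# even when no column is flagged
result <- colors[, !isInvariant, drop = FALSE]
result
```

Here only the first column is flagged, since blue and green are both 4 there, so result contains the original columns 2 and 3.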

Upvotes: 3
