User2321
User2321

Reputation: 3062

How to remove two columns with the same name in data table R

I have the following data table:

my_dt = data.table(a = c(1,2,3), b = c(2,3,4), a = c(8,9,9))
> my_dt
   a b a
1: 1 2 8
2: 2 3 9
3: 3 4 9

The table has two columns with the same name and I want to remove both of them. I could simply set a to NULL and then do it again but I wanted to check if there is a data table way of doing this. I tried the prescribed way (Removing multiple columns from R data.table with parameter for columns to remove) but I cannot get to work:

cols_to_delete = "a"
my_dt[, (cols_to_delete) := NULL]
#Only deletes the first occurence
> my_dt
   b a
1: 2 8
2: 3 9
3: 4 9

cols_to_delete = c("a", "a")
my_dt[, (cols_to_delete) := NULL]
Error in `[.data.table`(my_dt, , `:=`((cols_to_delete), NULL)) : 
  Can't assign to the same column twice in the same query (duplicates detected).

I know that having the same column name is not ideal but I was wondering if there is some command that I am missing.

Upvotes: 1

Views: 658

Answers (2)

Frank Zhang
Frank Zhang

Reputation: 1688

You can use indices instead.

cols_to_delete = c(1, 3)
# OR
# cols_to_delete <- which(duplicated(names(my_dt))  | duplicated(names(my_dt),fromLast = TRUE))
my_dt[, (cols_to_delete) := NULL]

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389055

You could select the columns which we want to keep instead of setting them to NULL

This can be done with

library(data.table)
cols_to_delete = "a"
my_dt[,names(my_dt) != cols_to_delete, with = FALSE]

#   b
#1: 2
#2: 3
#3: 4

Or using setdiff :

my_dt[,setdiff(names(my_dt), cols_to_delete), with = FALSE]

Also :

cols <- setdiff(names(my_dt), cols_to_delete)
my_dt[,..cols]

Upvotes: 1

Related Questions