Reputation: 3062
I have the following data table:
my_dt = data.table(a = c(1,2,3), b = c(2,3,4), a = c(8,9,9))
> my_dt
a b a
1: 1 2 8
2: 2 3 9
3: 3 4 9
The table has two columns with the same name and I want to remove both of them. I could simply set a
to NULL and then do it again but I wanted to check if there is a data table way of doing this.
I tried the prescribed way (Removing multiple columns from R data.table with parameter for columns to remove) but I cannot get to work:
cols_to_delete = "a"
my_dt[, (cols_to_delete) := NULL]
#Only deletes the first occurence
> my_dt
b a
1: 2 8
2: 3 9
3: 4 9
cols_to_delete = c("a", "a")
my_dt[, (cols_to_delete) := NULL]
Error in `[.data.table`(my_dt, , `:=`((cols_to_delete), NULL)) :
Can't assign to the same column twice in the same query (duplicates detected).
I know that having the same column name is not ideal but I was wondering if there is some command that I am missing.
Upvotes: 1
Views: 658
Reputation: 1688
You can use indices instead.
cols_to_delete = c(1, 3)
# OR
# cols_to_delete <- which(duplicated(names(my_dt)) | duplicated(names(my_dt),fromLast = TRUE))
my_dt[, (cols_to_delete) := NULL]
Upvotes: 2
Reputation: 389055
You could select the columns which we want to keep instead of setting them to NULL
This can be done with
library(data.table)
cols_to_delete = "a"
my_dt[,names(my_dt) != cols_to_delete, with = FALSE]
# b
#1: 2
#2: 3
#3: 4
Or using setdiff
:
my_dt[,setdiff(names(my_dt), cols_to_delete), with = FALSE]
Also :
cols <- setdiff(names(my_dt), cols_to_delete)
my_dt[,..cols]
Upvotes: 1