Jakob

Reputation: 1463

R: change column order in data.table for only some columns

I want to sort two columns to the front of my data.table (id and time in my case). Say I have:

library(data.table)
Data <- as.data.table(iris)

and say I want the order of the columns to be:

example <- copy(Data)  # copy() so that Data itself is not reordered by reference
setcolorder(example, c("Species","Petal.Length","Sepal.Length",
                       "Sepal.Width","Petal.Width"))

but my actual data table has many more variables, so I would like to address this as:

setcolorder(Data, c("Species","Petal.Length", 
                    ...all other variables in their original order...))

I played around with something like:

setcolorder(Data,c("Species","Petal.Length",
                    names(Data)[!c("Species","Petal.Length")]))

but I have a problem subsetting the character vector names(Data) by name reference. Also I'm sure I can avoid this workaround with some neat data.table function, no?

Upvotes: 7

Views: 5438

Answers (3)

Valentas

Reputation: 2245

You can just do

setcolorder(Data,c("Species","Petal.Length"))

similar to xcols in kdb+/q. ?setcolorder says:

If ‘length(neworder) < length(x)’, the specified columns are moved in order to the "front" of ‘x’.

My version of data.table is 1.11.4, but it might have been available for earlier versions too.
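A minimal check of this on the question's data, assuming data.table 1.11.4 or later (older versions may reject a partial column list):

```r
library(data.table)

Data <- as.data.table(iris)

# Pass only the columns that should come first; the remaining columns
# keep their original relative order (assumes a data.table version that
# accepts a partial `neworder`).
setcolorder(Data, c("Species", "Petal.Length"))

names(Data)
```

Because setcolorder works by reference, no assignment is needed.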

Upvotes: 5

Clayton Stanley

Reputation: 7784

This is totally a riff on akrun's solution, using a bit more functional decomposition and an anaphoric macro because, well, why not.

I'm no expert in writing R macros, so this is probably a naive solution.

> toFront <- function(vect, ...) {
   c(..., setdiff(vect, c(...)))
}
> withColnames <- function(tbl, thunk) {
  .CN = colnames(tbl)
  eval(substitute(thunk))
}
> vect = c('c', 'd', 'e', 'a', 'b')
> tbl = data.table(1,2,3,4,5)
> setnames(tbl, vect)
> tbl
   c d e a b
1: 1 2 3 4 5
> withColnames(tbl, setcolorder(tbl, toFront(.CN, 'a', 'b') ))
> tbl
   a b c d e
1: 4 5 1 2 3

Upvotes: 1

akrun

Reputation: 887751

We can use setdiff to get the column names that are not in 'nm1', then concatenate 'nm1' with that result inside setcolorder:

 nm1 <- c("Species", "Petal.Length")
 setcolorder(Data, c(nm1, setdiff(names(Data), nm1)))

 names(Data)
 #[1] "Species"      "Petal.Length" "Sepal.Length" "Sepal.Width"  "Petal.Width" 

A convenience function for this is:

setcolfirst = function(DT, ...){
  nm = as.character(substitute(c(...)))[-1L]
  setcolorder(DT, c(nm, setdiff(names(DT), nm)))
} 

setcolfirst(Data, Species, Petal.Length)

The columns are passed without quotes here, but extension to a character vector is easy.
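One way that extension might look, as a sketch; the name setcolfirst_chr and the input check are my own additions for illustration, not part of data.table:

```r
library(data.table)

# Hypothetical helper: move the columns named in the character vector
# `cols` to the front, keeping all other columns in their original order.
setcolfirst_chr <- function(DT, cols) {
  cols <- unique(cols)  # setcolorder does not accept duplicate names
  stopifnot(is.character(cols), all(cols %in% names(DT)))
  setcolorder(DT, c(cols, setdiff(names(DT), cols)))
}

Data <- as.data.table(iris)
front <- c("Species", "Petal.Length")
setcolfirst_chr(Data, front)

names(Data)
```

This form is handy when the column names are already stored in a variable, as with front above.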

Upvotes: 10
