mindlessgreen

Reputation: 12142

Order a data frame programmatically using a character vector of column names

I am trying to order a data frame on multiple columns, where the column names are passed in through a variable, i.e. a character vector.

df <- data.frame(var1 = c("b","a","b","a"), var2 = c("l","l","k","k"),
                 var3 = c("t","w","x","t"))

  var1 var2 var3
1    b    l    t
2    a    l    w
3    b    k    x
4    a    k    t

Sorting on one column using a variable works:

sortvar <- "var1"
df[order(df[ , sortvar]),]

  var1 var2 var3
2    a    l    w
4    a    k    t
1    b    l    t
3    b    k    x

Now, if I want to order by two columns, the above solution does not work.

sortvar <- c("var1", "var2")
df[order(df[, sortvar]), ] #does not work

I can manually order with column names:

df[with(df, order(var1, var2)),]

  var1 var2 var3
4    a    k    t
2    a    l    w
3    b    k    x
1    b    l    t

But how do I order the data frame dynamically on multiple columns using a variable that holds the column names? I am aware of the arrange function in plyr and dplyr, but I want to use base R here.

Upvotes: 6

Views: 1323

Answers (2)

MrFlick

Reputation: 206596

It's a bit awkward, but you can use do.call() to pass each of the columns to order() as a separate argument:

dat[do.call("order", dat[,cols, drop=FALSE]), ]

I added drop=FALSE just in case length(cols)==1, where indexing a data.frame would otherwise return a vector instead of a list. You can wrap it in a function to make it a bit easier to use:

order_by_cols <- function(data, cols=1) {
  data[do.call("order", data[, cols, drop=FALSE]), ]
}

order_by_cols(dat, cols)
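As a rough sketch of how the wrapper behaves, here it is applied to the data frame from the question, once with a single column name and once with two. The variable names one_col, two_cols, by_one and by_two are made up for this illustration.

```r
# Data frame from the question
df <- data.frame(var1 = c("b", "a", "b", "a"),
                 var2 = c("l", "l", "k", "k"),
                 var3 = c("t", "w", "x", "t"))

# Wrapper as defined above
order_by_cols <- function(data, cols = 1) {
  data[do.call("order", data[, cols, drop = FALSE]), ]
}

# Illustrative names, not part of the original answer
one_col  <- "var1"
two_cols <- c("var1", "var2")

# Without drop = FALSE a single column collapses to a plain vector,
# which do.call() cannot use as its argument list
is.data.frame(df[, one_col])                # FALSE
is.data.frame(df[, one_col, drop = FALSE])  # TRUE

by_one <- order_by_cols(df, one_col)   # rows come back in the order 2, 4, 1, 3
by_two <- order_by_cols(df, two_cols)  # rows come back in the order 4, 2, 3, 1
```

The two results match the single-column and two-column outputs shown in the question.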

It's a bit easier with dplyr, if that's something you might consider:

library(dplyr)
dat %>% arrange(across(all_of(cols)))
dat %>% arrange_at(cols)  # works too, but arrange_at() has been superseded by the across() form

Upvotes: 0

Konrad Rudolph

Reputation: 546133

order expects multiple ordering variables as separate arguments, which is unfortunate in your case but suggests a direct solution: use do.call:

df[do.call(order, df[, sortvar, drop = FALSE]), ]

In case you’re unfamiliar with do.call: it constructs and executes a call programmatically. The following two statements are equivalent:

fun(arg1, arg2, …)
do.call(fun, list(arg1, arg2, …))
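To make the equivalence concrete, here is a small sketch using the data frame from the question. The names idx_manual, idx_docall and idx_desc are made up for this illustration, and the descending variant is an extra that the answer itself does not cover: it assumes you want to pass order()'s decreasing argument along with the columns.

```r
# Data frame and column names from the question
df <- data.frame(var1 = c("b", "a", "b", "a"),
                 var2 = c("l", "l", "k", "k"),
                 var3 = c("t", "w", "x", "t"))
sortvar <- c("var1", "var2")

# Columns written out by hand
idx_manual <- order(df$var1, df$var2)

# Same call built programmatically: a data frame is a list of columns,
# so each selected column becomes one argument to order()
idx_docall <- do.call(order, df[sortvar])

identical(idx_manual, idx_docall)  # TRUE

# Extra named arguments can be appended to the list. unname() drops the
# column names so they cannot be mistaken for order()'s own arguments.
idx_desc <- do.call(order, c(unname(as.list(df[sortvar])),
                             list(decreasing = TRUE)))
df[idx_desc, ]  # rows come back in the order 1, 3, 2, 4
```

The unname() step only matters if a column happens to share a name with one of order()'s parameters, such as decreasing or method, but it is cheap insurance in a general-purpose helper.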

Upvotes: 10
