Reputation: 12142
I am trying to order a data frame on multiple columns, where the column names are passed in a variable, i.e. a character vector.
df <- data.frame(var1 = c("b","a","b","a"), var2 = c("l","l","k","k"),
var3 = c("t","w","x","t"))
var1 var2 var3
1 b l t
2 a l w
3 b k x
4 a k t
Sorting on one column using a variable works:
sortvar <- "var1"
df[order(df[ , sortvar]),]
var1 var2 var3
2 a l w
4 a k t
1 b l t
3 b k x
Now, if I want to order by two columns, the above solution does not work.
sortvar <- c("var1", "var2")
df[order(df[, sortvar]), ] #does not work
I can manually order with column names:
df[with(df, order(var1, var2)),]
var1 var2 var3
4 a k t
2 a l w
3 b k x
1 b l t
But, how do I order the data frame dynamically on multiple columns using a variable with column names? I am aware of the plyr and dplyr arrange function, but I want to use base R here.
Upvotes: 6
Views: 1323
Reputation: 206596
It's a bit awkward, but you can use do.call() to pass each of the columns to order as a different argument:
dat[do.call("order", dat[,cols, drop=FALSE]), ]
I added drop=FALSE just in case length(cols)==1, where indexing a data.frame would return a vector instead of a list. You can wrap it in a function to make it a bit easier to use:
order_by_cols <- function(data, cols=1) {
data[do.call("order", data[, cols, drop=FALSE]), ]
}
order_by_cols(dat, cols)
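As a quick check, the helper can be run against the data frame from the question. This is a minimal sketch that reuses the question's df and sortvar names in place of dat and cols; sorted_two and sorted_one are illustrative names only.

```r
# Sample data from the question
df <- data.frame(var1 = c("b", "a", "b", "a"),
                 var2 = c("l", "l", "k", "k"),
                 var3 = c("t", "w", "x", "t"))

# Helper from this answer: each selected column becomes one argument to order()
order_by_cols <- function(data, cols = 1) {
  data[do.call("order", data[, cols, drop = FALSE]), ]
}

sortvar <- c("var1", "var2")
sorted_two <- order_by_cols(df, sortvar)  # rows come back in the order 4, 2, 3, 1
sorted_one <- order_by_cols(df, "var1")   # a single column also works because of drop = FALSE
```

The two-column result matches the manual order(var1, var2) output shown in the question.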
It's a bit easier with dplyr, if that's something you might consider:
library(dplyr)
dat %>% arrange(across(all_of(cols)))
dat %>% arrange_at(cols) # though this method has been superseded by the above line
Upvotes: 0
Reputation: 546133
order expects multiple ordering variables as separate arguments, which is unfortunate in your case but suggests a direct solution: use do.call:
df[do.call(order, df[, sortvar]), ]
In case you’re unfamiliar with do.call: it constructs and executes a call programmatically. The following two statements are equivalent:
fun(arg1, arg2, …)
do.call(fun, list(arg1, arg2, …))
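Applied to the question's data, the equivalence can be checked directly. This is only a sketch: a data frame is a list of columns, so do.call passes each column to order as its own (named) argument. The unname() variant is an optional precaution that is not part of the answer above; it covers the case where a column name happens to clash with one of order's own arguments, such as decreasing or method. The idx_* names are illustrative.

```r
df <- data.frame(var1 = c("b", "a", "b", "a"),
                 var2 = c("l", "l", "k", "k"),
                 var3 = c("t", "w", "x", "t"))
sortvar <- c("var1", "var2")

# Spelled out by hand ...
idx_manual <- order(df$var1, df$var2)

# ... and built programmatically from the character vector
idx_docall <- do.call(order, df[, sortvar])

# Optional precaution (an assumption of this sketch): drop the column names so
# they cannot be matched to order()'s named arguments
idx_unnamed <- do.call(order, unname(as.list(df[sortvar])))

identical(idx_manual, idx_docall)  # TRUE
df[idx_docall, ]
```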
Upvotes: 10