Vassilis Chasiotis

Reputation: 439

How to sort a vector based on a list in R

For example, I have the vector

x = c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,1,-1,1)

And the following list:

Y = list(1:7, 8:11, 12:15, 16:19)

How can I sort the vector x based on the list Y? That is, I want to sort the first 7 elements, the next 4, the next 4, and the last 4 separately, all in one step.

The desired output should be c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,-1,1,1).

Note that list Y is not always the same.

I tried to use x[unlist(sapply(Y, sort))], but it does not work.

Do you have any suggestions?

Upvotes: 1

Views: 133

Answers (3)

David Arenburg

Reputation: 92300

You can also avoid a loop and vectorize by ordering on both Y and x at the same time (order accepts several vectors and uses each later one to break ties in the earlier ones):

x[order(rep(seq_along(Y), lengths(Y)), x)]
# [1] -1 -1 -1 -1  1  1  1 -1 -1  1  1 -1 -1  1  1 -1 -1  1  1
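
To see why the one-liner works, it can be unpacked into steps. This is only a minimal sketch on the question's x and Y; the names grp, ord and res are introduced here for illustration. It assumes the index blocks in Y are contiguous and cover x from start to end, as they do in the example.

```r
x <- c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,1,-1,1)
Y <- list(1:7, 8:11, 12:15, 16:19)

# One group label per element of x: 1 seven times, then 2, 3 and 4 four times each.
# Assumption: the blocks in Y are contiguous and listed in order.
grp <- rep(seq_along(Y), lengths(Y))

# Order by group label first; ties within a group are broken by the value of x
ord <- order(grp, x)

# Reordering x with that permutation sorts each block without mixing blocks
res <- x[ord]
```

Because grp is already non-decreasing, no element ever leaves its block; only the order inside each block changes.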

A benchmark for illustration:

set.seed(123)
N <- 1e5
x <- sample(N) 
Y <- split(1:N, rep(1 : (N/5), each = 5))


microbenchmark::microbenchmark("Gregor" = unlist(lapply(lapply(Y, function(i) x[i]), sort)),
                               "Frank1" = ave(x, stack(setNames(Y, seq_along(Y)))$ind, FUN = sort),
                               "Frank2" = x[order(stack(setNames(Y, seq_along(Y)))$ind, x)],
                               "Jilber" =  unlist(lapply(Y, function(z) sort(x[z]))),
                               "David" = x[order(rep(seq_along(Y), lengths(Y)), x)])


# Unit: milliseconds
#   expr        min         lq       mean     median         uq         max neval cld
# Gregor 904.277546 937.652137 958.911977 949.311012 961.324917 1164.024555   100   c
# Frank1 884.306262 922.496558 956.408754 941.394433 962.098976 1140.254656   100   c
# Frank2  27.384839  28.587845  30.806481  29.542219  31.684239   46.947814   100  b 
# Jilber 923.135901 949.318532 967.981792 962.090176 976.574574 1137.115863   100   c
# David   2.184901   2.326817   2.622338   2.492732   2.524091    8.586322   100 a  

The fully vectorized solution is roughly 400 times faster than the lapply-based loops (median of about 2.5 ms versus about 950 ms).

Upvotes: 2

Jilber Urbina

Reputation: 61214

Not sure this is what you're asking for:

> unlist(lapply(y, function(z) sort(x[z])))
 [1] -1 -1 -1 -1  1  1  1 -1 -1  1  1 -1 -1  1  1 -1 -1  1  1

Inputs

x <- c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,1,-1,1)
y <- list(1:7, 8:11, 12:15, 16:19)

Upvotes: 3

Gregor Thomas

Reputation: 146224

unlist(lapply(lapply(Y, function(i) x[i]), sort))
# [1] -1 -1 -1 -1  1  1  1 -1 -1  1  1 -1 -1  1  1 -1 -1  1  1

This first extracts the elements of x according to the indices in Y into different list items, with

lapply(Y, function(i) x[i])

then it sorts each one independently with lapply(..., sort), then recombines them back into a vector with unlist.


Using this input:

x = c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,1,-1,1)
Y = list(1:7, 8:11, 12:15, 16:19)
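
Since Y is not always the same, it may help to note what happens when its index sets are not contiguous or not listed in order: the concatenated pieces would then no longer line up with the original positions. A hedged variant, assuming the index sets are disjoint, writes each sorted piece back to its own positions. The helper name sort_within and the second example (x2, Y2) are hypothetical, not part of the question.

```r
# Sort the values of x within each index set of Y, leaving every value
# in a position that belongs to its own set.
# Assumption: the index sets in Y do not overlap.
sort_within <- function(x, Y) {
  for (i in Y) {
    x[i] <- sort(x[i])
  }
  x
}

# The question's input
x <- c(-1,-1,-1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,-1,1,-1,1)
Y <- list(1:7, 8:11, 12:15, 16:19)
res <- sort_within(x, Y)

# Hypothetical input with interleaved index sets
x2 <- c(3, 9, 1, 7, 2, 8)
Y2 <- list(c(1, 3, 5), c(2, 4, 6))
res2 <- sort_within(x2, Y2)
# res2 is 1 7 2 8 3 9
```

For contiguous, ordered blocks this gives the same result as the unlist(lapply(...)) approach; the explicit loop trades some speed for not depending on the layout of Y.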

Upvotes: 3
