learnr

Reputation: 6649

How do I sort one vector based on the values of another?

I have a vector x that I would like to sort based on the order of values in vector y. The two vectors are not of the same length.

x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

The expected result would be:

[1] 4 4 4 2 2 1 3 3 3

Upvotes: 134

Views: 96471

Answers (8)

zprinsloo

Reputation: 11

You can also use the fast matching in the {collapse} package for an answer similar to Yorgos's. This is more efficient than match(); see the package documentation for details.

x[order(collapse::fmatch(x,y))]

As an FYI, you can also use an apply-style approach that looks up the positions of each value of y in x:

x[unlist(lapply(y, function(z) which(x == z)))]

However, this is mostly out of interest. I would rather use the match() or fmatch() approaches, which are more elegant and more efficient, particularly fmatch().

Upvotes: 1

OmG

Reputation: 18838

You can also use sqldf and do it with a SQL join, like the following:

library(sqldf)
x <- data.frame(x = c(2, 2, 3, 4, 1, 4, 4, 3, 3))
y <- data.frame(y = c(4, 2, 1, 3))

result <- sqldf("SELECT x.x FROM y JOIN x on y.y = x.x")
ordered_x <- result[[1]]

Upvotes: 0

George Shimanovsky

Reputation: 1956

In case you need to follow the order of "y" regardless of whether the values are numbers or characters:

x[order(ordered(x, levels = y))]
[1] 4 4 4 2 2 1 3 3 3

By steps:

a <- ordered(x, levels = y) # Create an ordered factor from "x" whose levels follow the order of "y".
[1] 2 2 3 4 1 4 4 3 3
Levels: 4 < 2 < 1 < 3

b <- order(a) # Get the permutation of "x" that matches the order of "y".
[1] 4 6 7 1 2 5 3 8 9

x[b] # Reorder "x" according to the order of "y".
[1] 4 4 4 2 2 1 3 3 3
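
To illustrate the character case, here is a minimal sketch using the same idea. The vectors fruits and wanted are made up for this example and are not from the question.

```r
# Hypothetical example data: a character vector and the desired order
fruits <- c("pear", "apple", "fig", "apple", "pear")
wanted <- c("fig", "pear", "apple")

# Build an ordered factor whose levels follow `wanted`, then reorder
sorted_fruits <- fruits[order(ordered(fruits, levels = wanted))]
print(sorted_fruits)
```

Indexing the original vector with order() returns a plain character vector here, so nothing downstream has to deal with a factor.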

Upvotes: 2

Ben Bolker

Reputation: 226087

How about:

rep(y,table(x)[as.character(y)])

(Ian's is probably still better)
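
Broken into steps, the one-liner above could be read roughly as follows. The intermediate names counts and times are only for illustration, and the sketch assumes every value of y occurs in x (a value missing from x would give an NA count, which rep() rejects).

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

# Count how often each distinct value occurs in x (names are the values as strings)
counts <- table(x)

# Look up the counts in the order given by y
times <- as.vector(counts[as.character(y)])

# Repeat each value of y as many times as it occurs in x
result <- rep(y, times)
print(result)
```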

Upvotes: 2

Yorgos

Reputation: 30445

What about this one:

x[order(match(x,y))]
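
A sketch of why this works, step by step, using the vectors from the question. The intermediate names pos, idx, and x2 are only for illustration.

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

# For each element of x, find its position in y
pos <- match(x, y)

# Permutation that sorts x by those positions; ties keep their original order
idx <- order(pos)

sorted_x <- x[idx]
print(sorted_x)

# Values of x that do not occur in y get NA from match(),
# and order() places NA last by default
x2 <- c(x, 5)
sorted_x2 <- x2[order(match(x2, y))]
print(sorted_x2)
```

So any elements of x that are absent from y are kept and moved to the end rather than dropped.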

Upvotes: 230

Shane

Reputation: 100154

[Edit: Clearly Ian has the right approach, but I will leave this in for posterity.]

You can do this without loops by indexing on your y vector. Add an incrementing numeric value to y and merge them:

y <- data.frame(index=seq_along(y), x=y)
x <- data.frame(x=x)
x <- merge(x,y)
x <- x[order(x$index),"x"]
x
[1] 4 4 4 2 2 1 3 3 3

Upvotes: 1

Matt Parker

Reputation: 27339

You could convert x into an ordered factor:

x.factor <- factor(x, levels = y, ordered = TRUE)
sort(x)         # usual numeric order
sort(x.factor)  # order given by the factor levels, i.e. by y

Obviously, changing your numbers into factors can radically change the way code downstream reacts to x. But since you didn't give us any context about what happens next, I thought I would suggest this as an option.
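
If downstream code does need numbers again, one possible way back is sketched below. The names sorted_factor, codes, and values are only for illustration. Note that as.numeric() applied directly to a factor returns the internal level codes, not the original values.

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

x.factor <- factor(x, levels = y, ordered = TRUE)
sorted_factor <- sort(x.factor)

# Pitfall: this gives the integer level codes (positions in y)
codes <- as.numeric(sorted_factor)

# Going through character recovers the original numeric values
values <- as.numeric(as.character(sorted_factor))
print(values)
```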

Upvotes: 6

Godeke

Reputation: 16281

x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)
z <- c()  # start with an empty result so the loop has something to append to
for(i in y) { z <- c(z, rep(i, sum(x==i))) }

The result in z: 4 4 4 2 2 1 3 3 3

The important steps:

  1. for(i in y) -- Loops over the elements of interest.

  2. z <- c(z, ...) -- Concatenates each subexpression in turn

  3. rep(i, sum(x==i)) -- Repeats i (the current element of interest) sum(x==i) times (the number of times we found i in x).
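
The steps above can be wrapped in a small function, sketched here. The name sort_by_reference is made up for this example. Note that this approach drops any values of x that do not appear in y.

```r
# Hypothetical helper: rebuild x in the order given by y
sort_by_reference <- function(x, y) {
  z <- x[0]  # empty vector of the same type as x
  for (i in y) {
    # append i once for every occurrence of i in x
    z <- c(z, rep(i, sum(x == i)))
  }
  z
}

x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)
z <- sort_by_reference(x, y)
print(z)
```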

Upvotes: 0
