Reputation: 1453
I want to sort a character vector that looks like this:
x <- c("white","white","blue","green","red","blue","red")
according to a specific order that looks like this:
y <- c("r","white","bl","gree")
If the second vector were spelled out in full, the answer can be found here. In reality, however, my first character vector has very long entries, and the second vector has much shorter but still long entries. All entries have different lengths. My goal is still c("red","red","white","white","blue","blue","green"). I actually only have unique entries in both vectors, but I guess the question will be more useful with a general answer. How could I approach this?
Upvotes: 2
Views: 137
Reputation: 39647
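You can use grep in combination with sapply. This only works when no element of x is matched by more than one entry of y, since an element matched twice is returned twice. It also returns only the elements of x that match some entry of y; everything else is dropped. The ^ anchors each pattern to the beginning of the string, and value = TRUE makes grep return the matching strings instead of their indices.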
unlist(sapply(paste0("^",y), grep, x, value = TRUE))
# ^r1 ^r2 ^white1 ^white2 ^bl1 ^bl2 ^gree
# "red" "red" "white" "white" "blue" "blue" "green"
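To make the two caveats concrete, here is a small sketch of what happens when they are violated. The names x2 and y2 are made up for this illustration: "redd" is matched by both "^r" and "^redd", and "yellow" is matched by nothing.

```r
x <- c("white", "white", "blue", "green", "red", "blue", "red")
y <- c("r", "white", "bl", "gree")

# Illustrative extension (x2, y2 are hypothetical names):
# "redd" is hit by two patterns, "yellow" by none.
x2 <- c(x, "redd", "yellow")
y2 <- c(y, "redd")

res <- unlist(sapply(paste0("^", y2), grep, x2, value = TRUE), use.names = FALSE)
res
# "redd" shows up twice in res, and "yellow" is missing from it
```

So with overlapping patterns you get duplicates, and non-matching elements silently disappear.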
The following also works when entries of y overlap; each element of x is placed at its first hit.
x <- c(x, "redd"); y <- c(y, "redd")
x[unique(unlist(sapply(paste0("^",y), grep, x)))]
#[1] "red" "red" "redd" "white" "white" "blue" "blue" "green"
Or place each element at its last hit instead:
x[unique(unlist(sapply(paste0("^",y), grep, x)), fromLast = TRUE)]
#[1] "red" "red" "white" "white" "blue" "blue" "green" "redd"
To keep all of x and place the non-matching elements at the end, you can use:
x <- c(x, "yellow")
x[unique(c(unlist(sapply(paste0("^",y), grep, x)), seq_along(x)))]
#[1] "red" "red" "redd" "white" "white" "blue" "blue" "green"
#[9] "yellow"
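If the real entries of y are long strings, they might contain characters that are special in a regular expression (such as . or parentheses). That is an assumption on my side, since the question does not show the real data. In that case, a possible variant is to match the prefixes literally with base R's startsWith instead of grep. The helper name order_by_prefix is made up for this sketch:

```r
# Hypothetical helper: same idea as above, but with literal prefix matching.
order_by_prefix <- function(x, prefixes) {
  # For each prefix, the indices of x that start with it (no regex involved).
  hits <- lapply(prefixes, function(p) which(startsWith(x, p)))
  # Keep the first hit of each index, then append the indices without any hit.
  idx <- unique(c(unlist(hits), seq_along(x)))
  x[idx]
}

x <- c("white", "white", "blue", "green", "red", "blue", "red", "redd", "yellow")
y <- c("r", "white", "bl", "gree", "redd")
order_by_prefix(x, y)
#[1] "red" "red" "redd" "white" "white" "blue" "blue" "green"
#[9] "yellow"
```

Like the grep version with unique, this places each element at its first hit and puts the non-matching elements at the end.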
Upvotes: 4