Reputation: 61
I am trying to get a vector of the unique elements of two vectors that respects the order of both of the original vectors.
Both vectors are sampled from a longer "hidden" vector that contains only unique entries (i.e. no repeats are allowed), which ensures that v1 and v2 have a compatible order (i.e. v1 <- c("Z", "A", ...) and v2 <- c("A", "Z", ...) cannot occur).
The order is arbitrary, so I cannot use a simple order() or sort(). An example is below:
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")
Result desired:
c("Z", "A", "T", "F", "Q", "D")
Further explanation: v1 establishes the relationship "Z" < "A" < "F" < "D" and v2 states "A" < "T" < "F" < "Q" < "D", so the sequence that satisfies both v1 and v2 is "Z" < "A" < "T" < "F" < "Q" < "D".
I understand this case is fully determined (the two vectors do completely define the order of all elements), but there would be cases when this is not enough. In that case, any permutation that respects the two sets of ordering would be a satisfactory solution.
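To make "respects the two sets of ordering" concrete, here is a minimal sketch of a check that a candidate result keeps the elements of a vector in their original relative order. The helper name respects_order is hypothetical (not from the post); it only uses base R.

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

# Hypothetical helper: TRUE if every element of v occurs in result
# and the elements appear there in the same relative order as in v.
respects_order <- function(result, v) {
  pos <- match(v, result)  # where each element of v sits in result
  !anyNA(pos) && !is.unsorted(pos, strictly = TRUE)
}

candidate <- c("Z", "A", "T", "F", "Q", "D")
ok <- respects_order(candidate, v1) && respects_order(candidate, v2)

# Swapping "Z" and "A" violates the order given by v1.
bad <- respects_order(c("A", "Z", "T", "F", "Q", "D"), v1)
```

Any permutation for which the check holds against both v1 and v2 would count as an acceptable answer in the sense described above.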
Any tips will be appreciated.
Upvotes: 4
Views: 356
Reputation: 39667
You can take the unique elements of v1 and v2 with unique() and re-sort them using match() on v1 and v2, repeating this until no change happens.
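x <- unique(c(v1, v2))        # all elements, in order of first appearance
repeat {
  y <- x                      # remember the state before this pass
  i <- match(v2, x)           # positions currently held by v2's elements
  x[sort(i)] <- x[i]          # refill those positions in v2's order
  i <- match(v1, x)
  x[sort(i)] <- x[i]          # same for v1
  if (identical(x, y)) break  # stop once a full pass changes nothing
}
x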
#[1] "Z" "A" "T" "F" "Q" "D"
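As a worked trace of what a single re-sorting step does, here is the first pass over v2 for the example from the question. This is only an illustration of the step above, not a different method; the values in the comments were followed by hand for this input.

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

x <- unique(c(v1, v2))  # "Z" "A" "F" "D" "T" "Q"
i <- match(v2, x)       # 2 5 3 6 4: where v2's elements currently sit
# sort(i) is 2 3 4 5 6, so those slots are refilled with
# x[i], i.e. "A" "T" "F" "Q" "D" (v2's own order).
x[sort(i)] <- x[i]
x                       # "Z" "A" "T" "F" "Q" "D"
```

The slots occupied by elements of v2 stay the same; only the elements placed in them are rearranged, which is why elements that are not in v2 (here "Z") do not move.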
Alternatively, you can get the overlapping elements of v1 and v2 and then join the subsets of v1 and v2 to these anchor points:
i <- v2[na.omit(match(v1, v2))]
j <- c(0, match(i, v2))
i <- c(0, match(i, v1))
unique(c(unlist(lapply(seq_along(i)[-1], function(k) {
c(v1[head((i[k-1]:i[k]), -1)], v2[head((j[k-1]:j[k])[-1], -1)])
})), v1, v2))
#[1] "Z" "A" "T" "F" "Q" "D"
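To see what the anchor points look like for the example, the intermediate values can be inspected as below. The names common, pos1 and pos2 are chosen here for readability (the code above reuses i and j instead).

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

common <- v2[na.omit(match(v1, v2))]  # shared elements: "A" "F" "D"
pos1 <- c(0, match(common, v1))       # anchor positions in v1: 0 2 3 4
pos2 <- c(0, match(common, v2))       # anchor positions in v2: 0 1 3 5

# Between consecutive anchors, each vector contributes its own run:
# before "A":          v1 gives "Z", v2 gives nothing
# from "A" up to "F":  v1 gives "A", v2 gives "T"
# from "F" up to "D":  v1 gives "F", v2 gives "Q"
```

The leading 0 acts as a sentinel so that elements before the first shared element are picked up as well; the final unique(c(..., v1, v2)) then appends whatever comes at or after the last anchor.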
Upvotes: 4
Reputation: 3256
For this example, the following code works. One first has to define auxiliary vectors w1 and w2, depending on which vector starts with the first common element, and another vector w to which the missing elements are appended in order.
It would be clearer using a for loop, which would avoid this cumbersome code, but as a first attempt this is faster and shorter.
w <- w1 <- unlist(ifelse(intersect(v1,v2)[1] == v1[1], list(v2), list(v1)))
w2 <- unlist(ifelse(intersect(v1,v2)[1] == v1[1], list(v1), list(v2)))
unique(lapply(setdiff(w2,w1), function(elmt) w <<- append(w, elmt, after = match(w2[match(elmt,w2)-1],w)))[[length(setdiff(w2,w1))]])
#[1] "Z" "A" "T" "F" "Q" "D"
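For comparison, the same insertion idea written with a for loop might look like the sketch below. The function name merge_by_insertion is hypothetical, and the guard for an element that has no predecessor in w2 is an added assumption that is not part of the one-liner above. It has only been followed through for the example from the question.

```r
v1 <- c("Z", "A", "F", "D")
v2 <- c("A", "T", "F", "Q", "D")

# Hypothetical for-loop variant of the insertion approach.
merge_by_insertion <- function(v1, v2) {
  first_common <- intersect(v1, v2)[1]
  # w is the base vector; w2 supplies the elements still missing from it.
  if (identical(first_common, v1[1])) {
    w <- v2; w2 <- v1
  } else {
    w <- v1; w2 <- v2
  }
  for (elmt in setdiff(w2, w)) {
    prev <- w2[match(elmt, w2) - 1]  # predecessor of elmt in w2
    # Assumption: with no predecessor, insert at the front.
    after <- if (length(prev) == 0) 0 else match(prev, w)
    w <- append(w, elmt, after = after)
  }
  w
}

res <- merge_by_insertion(v1, v2)
res
# [1] "Z" "A" "T" "F" "Q" "D"
```

Each missing element is placed directly after its predecessor from w2, so w never has to be modified from inside lapply() with <<-.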
Upvotes: 1