BrownNation

Reputation: 65

Finding Sequences [gap or difference] between two vectors

Suppose I have two vectors:

a <- c(1,3,5,7,9, 23,35,36,43)
b <- c(2,4,6,8,10,24, 37, 45)

Note that the two vectors have different lengths.

I want to pair each element of a with the element of b that is closest to it, leaving NA where an element of a has no partner.

Expected Output

a     b
1     2
3     4
5     6
7     8
9     10
23    24
35    NA
36    37
43    45

Note that 35 gets NA because 37 is closer to 36 than to 35, so 37 is paired with 36.

Upvotes: 5

Views: 99

Answers (2)

BENY

Reputation: 323226

You can use findInterval:

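# Pre-fill b with NA so rows of a that receive no match stay NA
df <- data.frame(a, b = NA)
# Put each b next to the largest a that does not exceed it (a must be sorted)
df$b[findInterval(b, a)] <- b
df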
   a  b
1  1  2
2  3  4
3  5  6
4  7  8
5  9 10
6 23 24
7 35 NA
8 36 37
9 43 45
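
To see why this works, it can help to look at the indices findInterval returns. The following is a minimal sketch using the vectors from the question; idx is just an illustrative name. Note that findInterval pairs each b with the largest a that is less than or equal to it, which coincides with the closest a for this data but is not a general nearest-value match.

```r
a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)

# Index of the largest element of a that is <= each element of b
idx <- findInterval(b, a)
print(idx)  # 1 2 3 4 5 6 8 9: index 7 (a = 35) is skipped, so that row stays NA

# A value below min(a) maps to index 0, which is not a valid row,
# so this approach assumes every b is at least min(a)
below <- findInterval(0, a)
print(below)  # 0
```
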

Upvotes: 5

Vlo

Reputation: 3188

This algorithm can only deal with one NA. For N possible NAs, you would have to try every combination of N insertion slots. It tries each possible NA insertion slot in b and keeps the one that minimizes mean(abs(a - b)), ignoring the NA.

  library(magrittr)  # provides %>%

  # Try inserting NA after each position of b and score the alignment
  sapply(seq_along(b),
         function(i) mean(abs(append(b, NA, i) - a), na.rm = TRUE)) %>%
  # Find index of the best insertion spot
  which.min %>%
  # Actually insert
  {append(b, NA, .)} %>%
  # Display data
  {cbind(a, b = .)}

       a  b
 [1,]  1  2
 [2,]  3  4
 [3,]  5  6
 [4,]  7  8
 [5,]  9 10
 [6,] 23 24
 [7,] 35 NA
 [8,] 36 37
 [9,] 43 45
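
The extension to N NAs mentioned above could look roughly like the sketch below. It is an assumption about how one might generalize, not part of the original answer: it uses combn to enumerate which rows of a are left unmatched, assumes length(a) > length(b) and that both vectors are sorted, and the names n_na, slots, score, best and aligned are illustrative. Unlike the pipeline above, it also considers placing an NA in the first row.

```r
a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)

n_na  <- length(a) - length(b)      # number of NAs to place (assumed >= 1)
slots <- combn(length(a), n_na)     # each column: rows of a left without a partner

# Score every placement by the mean absolute difference of the matched pairs
score <- apply(slots, 2, function(na_rows) {
  aligned <- rep(NA_real_, length(a))
  aligned[-na_rows] <- b            # fill the remaining rows with b, in order
  mean(abs(aligned - a), na.rm = TRUE)
})

# Rebuild the best alignment
best <- slots[, which.min(score)]
aligned <- rep(NA_real_, length(a))
aligned[-best] <- b
print(cbind(a, b = aligned))
```

The number of placements grows as choose(length(a), n_na), so this brute-force search is only practical for short vectors or a small number of NAs.
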

Upvotes: 1
