Reputation: 65
Suppose I have two vectors:
a <- c(1,3,5,7,9, 23,35,36,43)
b <- c(2,4,6,8,10,24, 37, 45)
Please note that the two vectors have different lengths.
I want to line the two vectors up so that each value in a is paired with the value in b closest to it, with NA wherever a value in a has no match.
Expected Output
a b
1 2
3 4
5 6
7 8
9 10
23 24
35 NA
36 37
43 45
Note that 35 has NA against it because 37 is closer to 36 than to 35, so 37 is paired with 36.
Upvotes: 5
Views: 99
Reputation: 323226
You can use findInterval:
df <- data.frame(a)
# findInterval(b, a) gives, for each b, the index of the largest a that is <= b
df$b <- NA
df$b[findInterval(b, a)] <- b
df
   a  b
1  1  2
2  3  4
3  5  6
4  7  8
5  9 10
6 23 24
7 35 NA
8 36 37
9 43 45
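To see why this works on the example data, it may help to look at the indices findInterval returns. The sketch below only inspects the intermediate result; the name idx is introduced here for illustration. Note the assumption it relies on: each value in b sits just above its partner in a, so "largest a that is <= b" happens to be the closest a. For data where the nearest a can be larger than b, or where two values of b land in the same interval, this would not give the closest match.

```r
a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)

# For each element of b, the position in a of the largest value <= that element
idx <- findInterval(b, a)
idx
# [1] 1 2 3 4 5 6 8 9

# Position 7 (a = 35) is never hit, so it stays NA after the assignment
setdiff(seq_along(a), idx)
# [1] 7

# Duplicated indices would mean two b values compete for one slot
anyDuplicated(idx)
# [1] 0
```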
Upvotes: 5
Reputation: 3188
This approach handles only a single NA. It tries every possible insertion slot for the NA and keeps the one that minimizes mean(abs(a - b)). For N NAs, you would have to try every way of placing them, which is choose(length(a), N) possibilities.
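library(magrittr)  # provides %>%

# Candidate slots: insert NA after position i (i = 0 puts it first)
slots <- 0:length(b)
# Try each insertion and score it by mean absolute difference
sapply(slots, function(i) mean(abs(append(b, NA, i) - a), na.rm = TRUE)) %>%
  # Find index of the best insertion spot
  which.min %>%
  # Actually insert
  {append(b, NA, slots[.])} %>%
  # Display data
  {cbind(a, b = .)}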
      a  b
[1,]  1  2
[2,]  3  4
[3,]  5  6
[4,]  7  8
[5,]  9 10
[6,] 23 24
[7,] 35 NA
[8,] 36 37
[9,] 43 45
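For more than one NA, the brute-force idea mentioned above could be sketched with combn, which enumerates every choice of NA positions. This is only an illustration under the assumption that a is longer than b and both are sorted; the names n_na, candidates, cost and b_padded are made up for this example, and the number of candidates grows combinatorially, so it is only practical for short vectors.

```r
a <- c(1, 3, 5, 7, 9, 23, 35, 36, 43)
b <- c(2, 4, 6, 8, 10, 24, 37, 45)

n_na <- length(a) - length(b)   # how many NAs are needed
stopifnot(n_na >= 1)            # assumption: a is strictly longer than b

# Every possible set of positions (in the padded vector) that could hold NA
candidates <- combn(length(a), n_na, simplify = FALSE)

# Score each placement by the mean absolute difference to a
cost <- sapply(candidates, function(pos) {
  padded <- rep(NA_real_, length(a))
  padded[-pos] <- b
  mean(abs(padded - a), na.rm = TRUE)
})

# Rebuild b with NAs at the best-scoring positions
best <- candidates[[which.min(cost)]]
b_padded <- rep(NA_real_, length(a))
b_padded[-best] <- b

cbind(a, b = b_padded)
```

With the example vectors there is a single NA to place, so this reduces to the one-slot search and puts the NA next to 35.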
Upvotes: 1