Reputation: 121
Right now I have two data frames:
amount = c(19,21,39,45,62,71,100,121,130,160,180,210,240)
id = rep(1,length(amount))
test <- data.frame(id,amount)
interval = c(1:10)
bottom = c(0,25,50,75,100,125,150,175,200,225)
top = c(24,49,74,99,124,149,174,199,224,NA)
test_2 <- data.frame(interval,bottom,top)
test
contains values of increasing amounts and test_2
contains intervals. I am trying to join the two frames so that for every amount in test
I get the correspondent ìnterval nr. as a new column, say test$intervall
. I tried foverlaps
already like this:
setDT(test)
setDT(test_2)
foverlaps(test_2,test,by.x = amount, by.y = c(bottom,top), type = "within")
However, this returns an error message, saying
Invalid numeric value for 'by.x'; it should be a vector with values 1 <= by.x <= length(x)
Maybe one of you can help me out here, thanks!
Upvotes: 1
Views: 120
Reputation: 388817
You can try fuzzyjoin
:
fuzzyjoin::fuzzy_left_join(test, test_2,
by = c('amount' = 'top', 'amount' = 'bottom'),
match_fun = list(`<=`, `>=`))
# id amount interval bottom top
#1 1 19 1 0 24
#2 1 21 1 0 24
#3 1 39 2 25 49
#4 1 45 2 25 49
#5 1 62 3 50 74
#6 1 71 3 50 74
#7 1 100 5 100 124
#8 1 121 5 100 124
#9 1 130 6 125 149
#10 1 160 7 150 174
#11 1 180 8 175 199
#12 1 210 9 200 224
#13 1 240 NA NA NA
Upvotes: 1
Reputation: 11584
Does this work:
library(dplyr)
library(tidyr)
test_2 %>% mutate(top = replace_na(top, 250)) %>% rowwise() %>%
mutate(iv = list(seq(bottom, top, by = 1))) %>%
unnest(iv) %>% right_join(test, by = c('iv' = 'amount'), keep = T) %>%
select(id, interval, amount)
# A tibble: 13 x 3
id interval amount
<dbl> <int> <dbl>
1 1 1 19
2 1 1 21
3 1 2 39
4 1 2 45
5 1 3 62
6 1 3 71
7 1 5 100
8 1 5 121
9 1 6 130
10 1 7 160
11 1 8 180
12 1 9 210
13 1 10 240
Upvotes: 1