Reputation: 896
One data table (let's call is A) contains the ID numbers:
ID
3
5
12
8
...
and another table (let's call it B) contains the lower bound and the upper bound and the name for that ID.
ID_lower ID_upper Name
1 4 James
5 7 Arthur
8 11 Jacob
12 13 Sarah
so based on table B, given the ID from table A, we can find the matching name by finding the name on the row in table B such that
ID_lower <= ID <= ID upper
and I wanna create a table of ID and Name, so in the above example, it would be
ID Name
3 James
5 Arthur
12 Sarah
8 Jacob
... ...
I used for loop, so that for each row of A, I look for the row in B such that ID is between the ID_lower and ID_upper for that row and joined the name from there. However, this method was a bit slow. Is there a fast way of doing it in R?
Upvotes: 0
Views: 120
Reputation: 118779
Using the new non-equi
joins feature in the current development version of data.table, this is straightforward:
require(data.table) # v1.9.7+
dt2[dt1, .(ID, Name), on=.(ID_lower <= ID, ID_upper >= ID)]
See the installation instructions for devel version here.
where,
dt1=fread('ID
3
5
12
8')
dt2 = fread('ID_lower ID_upper Name
1 4 James
5 7 Arthur
8 11 Jacob
12 13 Sarah')
Upvotes: 2
Reputation: 35314
I believe findInterval()
on ID_lower
might be the ideal approach here:
A[,Name:=B[findInterval(ID,ID_lower),Name]];
A;
## ID Name
## 1: 3 James
## 2: 5 Arthur
## 3: 12 Sarah
## 4: 8 Jacob
This will only be correct if (1) B
is sorted by ID_lower
and (2) all values in A$ID
are covered by the ranges in B
.
Upvotes: 1
Reputation: 214927
You can try this data.table
solution:
data.table::setDT(B)[, .(Name, ID = Map(`:`, ID_lower, ID_upper))]
[, .(ID = unlist(ID)), .(Name)][ID %in% A$ID]
Name ID
1: James 3
2: Arthur 5
3: Sarah 12
4: Jacob 8
Upvotes: 1
Reputation: 19544
You can make a look-up table with your second data.frame (B):
lu <- do.call(rbind,
apply(B,1,function(x)
data.frame(ID=c(x[1]:x[2]),Name=x[3], row.names = NULL)))
then you query it with your first data.frame (A):
A$Name <- lu[A$ID,"Name"]
Upvotes: 1