Reputation: 1037
I want to do something similar to vlookup
in R, using numeric variable as a basis.
Sample lookup table:
> Value <- c(1,1.5,2,2.5,3,3.5,4,4.5,5)
> Code <- c("A","B","C","D","E","F","G","H","I")
> Lookup_Table <- data.frame(Value, Code)
> Lookup_Table
Value Code
1 1.0 A
2 1.5 B
3 2.0 C
4 2.5 D
5 3.0 E
6 3.5 F
7 4.0 G
8 4.5 H
9 5.0 I
Sample data table:
> DataSample <- c(1.2,1,2.3,2.7,3.1,3,4.6,4.5,3.8)
> DataSample <- data.frame(DataSample)
> DataSample
DataSample
1 1.2
2 1.0
3 2.3
4 2.7
5 3.1
6 3.0
7 4.6
8 4.5
9 3.8
So from this DataSample
values I want to match the corresponding Code
based on the lookup table values. And if my value is for example 1.2
, I want to round it up to the nearest value on the lookup table which is 1.5
. So I expect to get the corresponding code for 1.5
.
My desired output is:
> DataSample
DataSample Code
1 1.2 B
2 1.0 A
3 2.3 D
4 2.7 E
5 3.1 F
6 3.0 E
7 4.6 I
8 4.5 H
9 3.8 G
Upvotes: 2
Views: 266
Reputation: 101663
A data.table
option with non-equi join
setorder(
setDT(Lookup_Table),
"Value"
)[setDT(DataSample),
on = .(Value >= DataSample)
][
,
.(Code = first(Code)), .(DataSample = Value)
]
which gives
DataSample Code
1: 1.2 B
2: 1.0 A
3: 2.3 D
4: 2.7 E
5: 3.1 F
6: 3.0 E
7: 4.6 I
8: 4.5 H
9: 3.8 G
Upvotes: 2
Reputation: 30474
A base R approach could use findInterval
like this:
DataSample$Code <- with(Lookup_Table,
Code[findInterval(DataSample$DataSample, Value, left.open = T) + 1])
Output
DataSample Code
1 1.2 B
2 1.0 A
3 2.3 D
4 2.7 E
5 3.1 F
6 3.0 E
7 4.6 I
8 4.5 H
9 3.8 G
Upvotes: 3
Reputation: 6979
Here I have used data.table
to:
foverlaps
function to do the mergevalue <- c(1,1.5,2,2.5,3,3.5,4,4.5,5)
code <- c("A","B","C","D","E","F","G","H","I")
Lookup_Table <- data.frame(value, code)
setDT(Lookup_Table)
Lookup_Table <- Lookup_Table[order(value)]
Lookup_Table[, previous.value := shift(value)]
Lookup_Table[, next.value := shift(value, type = "lead")]
Lookup_Table[, start := (previous.value + value) / 2]
Lookup_Table[, end := (next.value + value) / 2]
Lookup_Table[is.na(start), start := value]
Lookup_Table[is.na(end), end := value]
Lookup_Table <- Lookup_Table[, .(start, end, value, code)]
setkey(Lookup_Table, start, end)
DataSample <- data.frame(value = c(1.2,1,2.3,2.7,3.1,3,4.6,4.5,3.8))
setDT(DataSample)
DataSample[, start := value]
DataSample[, end := value]
DataSample <- DataSample[, .(start, end, value)]
setkey(DataSample, start, end)
res <- foverlaps(
DataSample,
Lookup_Table,
by.x = c("start", "end"),
by.y = c("start", "end")
)
res <- res[, .(value = i.value, code)]
> res
# value code
# 1: 1.0 A
# 2: 1.2 A
# 3: 2.3 D
# 4: 2.7 D
# 5: 3.0 E
# 6: 3.1 E
# 7: 3.8 G
# 8: 4.5 H
# 9: 4.6 H
Result is slightly different, you probably want to play with how ranges are defined and applied
Upvotes: 2