Reputation: 1539
I have two data.table
that I want to merge based on two columns. One on exact match (station
) and the other on the nearest value (depth_m
).
library(data.table)
df1 <- data.table(
station = "a",
depth_m = 1
)
df2 <- data.table(
station = c("a", "a", "b"),
depth_m = c(1.1, 1.1, 2),
wavelength = c(300, 350, 300),
bbp = c(0.0012, 0.0013, 0.0014)
)
df1
#> station depth_m
#> 1: a 1
df2
#> station depth_m wavelength bbp
#> 1: a 1.1 300 0.0012
#> 2: a 1.1 350 0.0013
#> 3: b 2.0 300 0.0014
To do it so, I am using roll = "nearest"
as follow:
df2[df1, on = .(station, depth_m), roll = "nearest"]
#> station depth_m wavelength bbp
#> 1: a 1 300 0.0012
However, I would also get the row of df2
where wavelength == 350
so the result is:
data.table(
station = c("a", "a"),
depth = c(1, 1),
wavelength = c(300, 305),
bbp = c(0.0012, 0.0013)
)
#> station depth wavelength bbp
#> 1: a 1 300 0.0012
#> 2: a 1 305 0.0013
Hence, I have tried to use mult = "all"
without sucess:
df2[df1, on = .(station, depth_m), roll = "nearest", mult = "all"]
#> station depth_m wavelength bbp
#> 1: a 1 300 0.0012
Any help is appreciated.
Created on 2021-02-06 by the reprex package (v1.0.0)
Upvotes: 3
Views: 741
Reputation: 67828
In theory you should be able to use mult = "all"
. However, there are two open issues on this topic which suggests that currently this doesn't work as expected. In with floats, roll="nearest", mult="all" erroneously gives only a single match a workaround was suggested, where the join column was converted to integer, which then returned the correct number of rows.
In a follow-up issue, using a later version data.table
version (Join on integers, roll = "nearest" and mult = "all" erroneously gives only a single match), the workaround no longer worked.
Upvotes: 3