Philippe Massicotte
Philippe Massicotte

Reputation: 1539

Multiple matches nearest roll join with data.table

I have two data.table that I want to merge based on two columns. One on exact match (station) and the other on the nearest value (depth_m).

library(data.table)

df1 <- data.table(
  station = "a",
  depth_m = 1
)

df2 <- data.table(
  station = c("a", "a", "b"),
  depth_m = c(1.1, 1.1, 2),
  wavelength = c(300, 350, 300),
  bbp = c(0.0012, 0.0013, 0.0014)
)

df1
#>    station depth_m
#> 1:       a       1
df2
#>    station depth_m wavelength    bbp
#> 1:       a     1.1        300 0.0012
#> 2:       a     1.1        350 0.0013
#> 3:       b     2.0        300 0.0014

To do it so, I am using roll = "nearest" as follow:

df2[df1, on = .(station, depth_m), roll = "nearest"]
#>    station depth_m wavelength    bbp
#> 1:       a       1        300 0.0012

However, I would also get the row of df2 where wavelength == 350 so the result is:

data.table(
  station = c("a", "a"),
  depth = c(1, 1),
  wavelength = c(300, 305),
  bbp = c(0.0012, 0.0013)
)
#>    station depth wavelength    bbp
#> 1:       a     1        300 0.0012
#> 2:       a     1        305 0.0013

Hence, I have tried to use mult = "all" without sucess:

df2[df1, on = .(station, depth_m), roll = "nearest", mult = "all"]
#>    station depth_m wavelength    bbp
#> 1:       a       1        300 0.0012

Any help is appreciated.

Created on 2021-02-06 by the reprex package (v1.0.0)

Upvotes: 3

Views: 741

Answers (1)

Henrik
Henrik

Reputation: 67828

In theory you should be able to use mult = "all". However, there are two open issues on this topic which suggests that currently this doesn't work as expected. In with floats, roll="nearest", mult="all" erroneously gives only a single match a workaround was suggested, where the join column was converted to integer, which then returned the correct number of rows.

In a follow-up issue, using a later version data.table version (Join on integers, roll = "nearest" and mult = "all" erroneously gives only a single match), the workaround no longer worked.

Upvotes: 3

Related Questions