mat
mat

Reputation: 2627

Find nearest value by group

I am looking for a practical way to retrieve the nearest value to 0 for each group using (preferably) data.table.

Assume the following DT:

set.seed(1)
library(data.table)
DT <- data.table(val = rnorm(1000), group = rep(1:10, each = 10)) # 10 groups

I have tried to combine both by = group and roll = "nearest", but it only returns the nearest value across and not by groups:

DT[val == 0, val, by = group, roll = "nearest"]
#   group       value
#1:     8 0.001105352

I could of course repeat the process for each group, but it would be impractical as the number of groups increases. E.g.:

res <- rbind(DT[val == 0 & group = 1, val, by = group, roll = "nearest"],
             DT[val == 0 & group = 2, val, by = group, roll = "nearest"],
             DT[val == 0 & group = 3, val, by = group, roll = "nearest"],
             ...)

Maybe I am missing some data.table feature?

Upvotes: 3

Views: 186

Answers (1)

Jaap
Jaap

Reputation: 83275

You don't necessarily need a join for that.

A possible solution using a combination of min and abs:

DT[, .(closest.val.to.zero = val[abs(val) == min(abs(val))]), by = group]

which gives:

    group closest.val.to.zero
 1:     1         0.011292688
 2:     2        -0.016190263
 3:     3         0.002131860
 4:     4         0.004398704
 5:     5         0.017395620
 6:     6         0.002415809
 7:     7         0.004884450
 8:     8         0.001105352
 9:     9        -0.040150452
10:    10        -0.010925691

A more generalised way of the option as posted by @chinsoon12 in the comments:

DT[CJ(group = group, val = 0, unique = TRUE)
   , on = .(group, val)
   , .(group, closest.val.to.zero = x.val)
   , roll = "nearest"]

Upvotes: 3

Related Questions