Reputation: 2627
I am looking for a practical way to retrieve the nearest value to 0
for each group using (preferably) data.table
.
Assume the following DT
:
set.seed(1)
library(data.table)
DT <- data.table(val = rnorm(1000), group = rep(1:10, each = 10)) # 10 groups
I have tried to combine both by = group
and roll = "nearest"
, but it only returns the nearest value across and not by groups:
DT[val == 0, val, by = group, roll = "nearest"]
# group value
#1: 8 0.001105352
I could of course repeat the process for each group, but it would be impractical as the number of groups increases. E.g.:
res <- rbind(DT[val == 0 & group = 1, val, by = group, roll = "nearest"],
DT[val == 0 & group = 2, val, by = group, roll = "nearest"],
DT[val == 0 & group = 3, val, by = group, roll = "nearest"],
...)
Maybe I am missing some data.table
feature?
Upvotes: 3
Views: 186
Reputation: 83275
You don't necessarily need a join for that.
A possible solution using a combination of min
and abs
:
DT[, .(closest.val.to.zero = val[abs(val) == min(abs(val))]), by = group]
which gives:
group closest.val.to.zero 1: 1 0.011292688 2: 2 -0.016190263 3: 3 0.002131860 4: 4 0.004398704 5: 5 0.017395620 6: 6 0.002415809 7: 7 0.004884450 8: 8 0.001105352 9: 9 -0.040150452 10: 10 -0.010925691
A more generalised way of the option as posted by @chinsoon12 in the comments:
DT[CJ(group = group, val = 0, unique = TRUE)
, on = .(group, val)
, .(group, closest.val.to.zero = x.val)
, roll = "nearest"]
Upvotes: 3