Reputation: 1529
I have a data frame from which I would like to select, within each group, the rows where y
is closest to a specific value (e.g. 5).
set.seed(1234)
df <- data.frame(x = c(rep("A", 4),
rep("B", 4)),
y = c(rep(4, 2), rep(1, 2), rep(6, 2), rep(3, 2)),
z = rnorm(8))
df
## x y z
## 1 A 4 -1.2070657
## 2 A 4 0.2774292
## 3 A 1 1.0844412
## 4 A 1 -2.3456977
## 5 B 6 0.4291247
## 6 B 6 0.5060559
## 7 B 3 -0.5747400
## 8 B 3 -0.5466319
The result would be:
## x y z
## 1 A 4 -1.2070657
## 2 A 4 0.2774292
## 3 B 6 0.4291247
## 4 B 6 0.5060559
Thank you, Philippe
Upvotes: 3
Views: 48
Reputation: 887048
Here is an option with data.table. Convert the 'data.frame' to a 'data.table' with setDT(df). Grouped by 'x', compute the absolute difference of 'y' from 5, find the elements equal to the minimum of that difference, and get their row indices (.I). Then extract the row-index column ("V1") from the result and use it to subset the dataset.
library(data.table)
setDT(df)[df[, {v1 <- abs(y-5)
.I[v1==min(v1)]}, x]$V1]
# x y z
#1: A 4 -1.2070657
#2: A 4 0.2774292
#3: B 6 0.4291247
#4: B 6 0.5060559
Upvotes: 1
Reputation: 9618
Alternatively using base R:
# abs() is needed so that values above and below 5 are compared by distance
df[do.call(c, tapply(df$y, df$x, function(v) abs(v - 5) == min(abs(v - 5)))), ]
x y z
1 A 4 -1.2070657
2 A 4 0.2774292
5 B 6 0.4291247
6 B 6 0.5060559
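Note that tapply() returns its results in group order, so the combined logical vector only lines up with the rows when df is already sorted by x, as it is here. A minimal sketch of a base R variant that avoids this assumption, using ave() to get the per-group minimum distance aligned to the original rows (the names target, dist, keep and res are only illustrative):

```r
# Same example data as in the question
set.seed(1234)
df <- data.frame(x = c(rep("A", 4), rep("B", 4)),
                 y = c(rep(4, 2), rep(1, 2), rep(6, 2), rep(3, 2)),
                 z = rnorm(8))

target <- 5                      # value that y should be closest to
dist <- abs(df$y - target)       # distance of every row from the target

# ave() applies min() within each group of x and returns a vector of the
# same length as dist, in the original row order
keep <- dist == ave(dist, df$x, FUN = min)

res <- df[keep, ]                # rows tied for the minimum are all kept
res
```

Because the comparison is done row by row against a vector in the original order, this should give the same rows even if the groups are interleaved.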
Upvotes: 3
Reputation: 18487
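library(dplyr)

df %>%
  group_by(x) %>%
  mutate(delta = abs(y - 5)) %>%    # distance of y from the target value
  filter(delta == min(delta)) %>%   # keep rows at the per-group minimum, ties included
  select(-delta)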
Upvotes: 4