Find the nearest value in a column of grouped data, and then their corresponding rows in R

Question

Say I have a data.frame and the value 5.2

df = data.frame(x = c(1,2,3,4,5,6, 1,2,3,4,5,6)+0.1, treat = c(rep('high',6), rep('low',6)))
df
my.val = 5.2

I would like to find the value in column 'x' that is nearest to my value, and then the rows it occurs in. Without grouping the data, I would use something like:

which(abs(df$x-my.val)==min(abs(df$x-my.val)))
#[1] 5 11

But how do you do this with a grouped table (in this case grouping by 'treat')?

library(dplyr)
df %>%  group_by(treat) %>% which(abs(df$x-my.val)==min(abs(df$x-my.val)))
Error in which(., abs(df$x - my.val) == min(abs(df$x - my.val))) : 
  argument to 'which' is not logical

Note: This is a simplified example. I have 45 groups in the original data set.

akrun · Accepted Answer

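The which call should go inside summarise, so that it is evaluated once per group. Note that which.min returns the position within each group (the first one if there are ties), not the row number in the full data.frame.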

library(dplyr)
df %>%
    group_by(treat) %>%
    summarise(i = which.min(abs(x - my.val)))
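
Since the question asks for the rows themselves, here is a minimal sketch of one way to get them. It is an assumption about what is wanted, not part of the original answer: the helper column `row` is a hypothetical name added before grouping so the original row numbers survive, and `filter` with `==` keeps all tied rows within a group.

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6) + 0.1,
                 treat = c(rep("high", 6), rep("low", 6)))
my.val <- 5.2

nearest <- df %>%
    mutate(row = row_number()) %>%   # hypothetical helper column: original row number
    group_by(treat) %>%
    # keep every row whose distance to my.val equals the group minimum
    filter(abs(x - my.val) == min(abs(x - my.val))) %>%
    ungroup()

nearest
```

With the example data this should keep one row per group, and `nearest$row` holds the row numbers in the ungrouped data.frame.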

Or, if 'my.val' has multiple elements:

library(purrr)
df %>%
    group_by(treat) %>%
    summarise(i = map_int(my.val, ~ which.min(abs(x - .x))))
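
With several targets, each group yields several indices, so it can help to label which index belongs to which target. The following is only a sketch under some assumptions: it uses `reframe`, which needs dplyr 1.1.0 or later, swaps in base `vapply` for `map_int`, and `my.vals` and the `target` column are made-up names for illustration.

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6) + 0.1,
                 treat = c(rep("high", 6), rep("low", 6)))
my.vals <- c(2.3, 5.2)   # hypothetical vector of target values

res <- df %>%
    group_by(treat) %>%
    reframe(
        target = my.vals,
        # within-group position of the x nearest to each target
        i = vapply(my.vals, function(v) which.min(abs(x - v)), integer(1))
    )

res
```

The result has one row per group and target, with `i` again being a position within the group.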

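Or we may use findInterval, provided 'x' is sorted in increasing order within each group. Note that it returns the position of the largest 'x' that does not exceed 'my.val', which matches the nearest value here but not in general.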

df %>%
    group_by(treat) %>% 
    summarise(i = findInterval(my.val, x))
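
To illustrate how findInterval relates to which.min, here is a small base R comparison on one group's worth of values. The second target, 5.9, is a made-up value chosen so that the two approaches disagree.

```r
x <- c(1, 2, 3, 4, 5, 6) + 0.1   # sorted, as in the question

fi_a <- findInterval(5.2, x)      # position of the largest x <= 5.2
wm_a <- which.min(abs(x - 5.2))   # position of the x nearest to 5.2

fi_b <- findInterval(5.9, x)      # 5.9 lies between 5.1 and 6.1, so this is the position of 5.1
wm_b <- which.min(abs(x - 5.9))   # 6.1 is closer to 5.9 than 5.1 is

c(fi_a, wm_a, fi_b, wm_b)
```

For 5.2 both give position 5, but for 5.9 findInterval still gives 5 while which.min gives 6, so which.min is the safer choice when the nearest value is what matters.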
