Return closest values to a specific column in R

Question

My dataset contains 500 observations. Here is an example of the data structure:

df <- data.frame(rating_mean=c(3.6, 4.0, 3.7, 4.8, 3.9, 5.1, 4.1, 4.3 ),
             actual_truth=c("true", "false", "false", "true", "true", "false", "false", "true"))

I would like to return the 60 items with a rating_mean closest to the value of 3.5 for "true" stimuli, and the same for "false" stimuli (so a total of 120 items). So far I have this but it's not correct:

df50 <- df %>%
  group_by(actual_truth) %>%
  top_n(n = 60, wt = rating_mean - 3.5)

Thank you.

akrun · Accepted Answer

One option is to arrange by 'actual_truth' and by the absolute difference between 'rating_mean' and the target value, then group by 'actual_truth' and slice the first 60 observations of each group:

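library(dplyr)

df %>%
   # sort within each truth value by distance from the target of 3.5
   arrange(actual_truth, abs(rating_mean - 3.5)) %>%
   group_by(actual_truth) %>%
   # keep the first 60 rows per group (fewer if a group is smaller)
   slice(seq_len(60))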
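
For context, the attempt in the question fails because top_n() keeps the rows with the largest wt, so a signed difference such as rating_mean - 3.5 selects the ratings furthest above 3.5 rather than the ones nearest to it. As a possible alternative, assuming dplyr 1.0.0 or later (where top_n() is superseded by slice_min() and slice_max()), the same selection could be sketched with slice_min() on the absolute difference. The names target, n_keep and closest are illustrative only, and n_keep is set to 2 here just so the 8-row sample data shows a visible effect; it would be 60 for the real data.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  rating_mean  = c(3.6, 4.0, 3.7, 4.8, 3.9, 5.1, 4.1, 4.3),
  actual_truth = c("true", "false", "false", "true",
                   "true", "false", "false", "true")
)

target <- 3.5  # value the ratings should be close to
n_keep <- 2    # illustrative; use 60 for the full 500-row dataset

closest <- df %>%
  group_by(actual_truth) %>%
  # keep the n_keep rows with the smallest distance to target;
  # with_ties = FALSE caps each group at n_keep rows even if distances tie
  slice_min(abs(rating_mean - target), n = n_keep, with_ties = FALSE) %>%
  ungroup()

print(closest)
```

With with_ties = TRUE (the default), rows that tie at the cutoff distance would all be kept, so a group could return more than n_keep rows.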
