K_D

Reputation: 147

Group and compare numbers in a list

Consider the following data frame, obtained after a cbind operation on two lists:

> fl
  x meanlist
1 1     48.5
2 2     32.5
3 3     28.0
4 4     27.0
5 5     25.5
6 6     20.5
7 7     27.0
8 8     24.0

class_median <- list(0, 15, 25, 35, 45)
class_list <- list(0:10, 10:20, 20:30, 30:40, 40:50)

The values in class_median represent the classes -10 to +10, 10 to 20, 20 to 30, etc.

First, I am trying to group the values in fl$meanlist according to the classes in class_list. Second, for each class I want to return the one value that is closest to the class median, as follows:

> fl_subset
  x meanlist cm
1 1     48.5 45
2 2     32.5 35
3 5     25.5 25

I am trying to use loops for the comparison, but the code is long and unmanageable, and the result is not correct.

Upvotes: 1

Views: 50

Answers (2)

tmfmnk

Reputation: 40141

One approach utilizing purrr and dplyr could be:

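library(dplyr)
library(purrr)

map2(.x = class_list,
     .y = class_median,
     ~ fl %>%
      # flag the rows whose value falls inside this class
      mutate(cm = between(meanlist, min(.x), max(.x))) %>%
      # keep only those rows (filter(any(cm)) would keep every row)
      filter(cm) %>%
      # replace the flag with the class median
      mutate(cm = cm * .y)) %>%
 bind_rows(.id = "ID") %>%
 group_by(ID) %>%
 # per class, keep the row closest to the median
 slice(which.min(abs(meanlist - cm)))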



  ID        x meanlist    cm
  <chr> <int>    <dbl> <dbl>
1 3         5     25.5    25
2 4         2     32.5    35
3 5         1     48.5    45

Upvotes: 0

Gregor Thomas

Reputation: 146020

Here's an approach with dplyr:

library(dplyr)

# do a little prep--name classes, extract breaks, put medians in a data frame
names(class_list) = letters[seq_along(class_list)]
breaks = c(min(class_list[[1]]), sapply(class_list, max))
med_data = data.frame(median = unlist(class_median), class = names(class_list))


fl %>% 
  # assign classes
  mutate(class = cut(meanlist, breaks = breaks, labels = names(class_list))) %>%
  # get medians
  left_join(med_data) %>%
  # within each class...
  group_by(class) %>%
  # keep the row with the smallest absolute difference to the median
  slice(which.min(abs(meanlist - median))) %>%
  # sort in original order
  arrange(x)

# Joining, by = "class"
# # A tibble: 3 x 4
# # Groups:   class [3]
#       x meanlist class median
#   <int>    <dbl> <fct>  <dbl>
# 1     1     48.5 e         45
# 2     2     32.5 d         35
# 3     5     25.5 c         25
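
For anyone who wants to try this without dplyr, here is a minimal base R sketch of the same idea. It is only an illustration under a few assumptions: fl is rebuilt by hand from the values printed in the question, the breaks are written out directly rather than derived from class_list, and the helper names idx, d and keep are made up for this example. As in the code above, cut() is used with its defaults, so the classes are right-closed intervals.

```r
# Rebuild the example data by hand (values copied from the question)
fl <- data.frame(
  x = 1:8,
  meanlist = c(48.5, 32.5, 28.0, 27.0, 25.5, 20.5, 27.0, 24.0)
)
class_median <- c(0, 15, 25, 35, 45)
breaks <- c(0, 10, 20, 30, 40, 50)  # assumed class boundaries

# Assign each value to a class index; with cut()'s defaults the
# intervals are (0,10], (10,20], ..., (40,50]
idx <- cut(fl$meanlist, breaks = breaks, labels = FALSE)
fl$cm <- class_median[idx]

# Within each class, keep the row closest to the class median
d <- abs(fl$meanlist - fl$cm)
keep <- tapply(seq_len(nrow(fl)), idx, function(i) i[which.min(d[i])])

# Restore the original row order
fl_subset <- fl[sort(as.vector(keep)), ]
fl_subset
```

On the example data this keeps the rows with x equal to 1, 2 and 5, the same rows as in the desired fl_subset. Values that fall outside the breaks get an NA class and are dropped by tapply().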

Upvotes: 2
