Anoushiravan R

Reputation: 21908

Grouping by each value of a column based on the categories of a list they fall into

Today has been quite challenging and I have run out of ideas, so the solution to this question may be quite obvious to you. I have a very simple data frame like the one below:

structure(list(user_id = c(101, 102, 102, 103, 103, 106, 107, 
111), phone_number = c(4030201, 4030201, 4030202, 4030202, 4030203, 
4030204, 4030205, 4030203), id = 1:8), class = "data.frame", row.names = c(NA, 
-8L))

and also a list:

list(c(1, 2, 3, 4, 5, 8), 6, 7)

I want to assign each value in the id column of my data frame to a group according to which element of the list it falls into, preferably with purrr package functions. So the desired output is something like this:

grp <- c(1, 1, 1, 1, 1, 2, 3, 1)

Thank you very much in advance. Learning from and beside you all has been a great honor of my life.

Sincerely Yours

Anoushiravan

Upvotes: 2

Views: 84

Answers (2)

AnilGoyal

Reputation: 26218

Case-I when the list is unnamed

df <- structure(list(
  user_id = c(101, 102, 102, 103, 103, 106, 107, 111),
  phone_number = c(4030201, 4030201, 4030202, 4030202, 4030203,
                   4030204, 4030205, 4030203),
  id = 1:8
), class = "data.frame", row.names = c(NA, -8L))
lst <- list(c(1, 2, 3, 4, 5, 8), 6, 7)
library(tidyverse)

df %>% mutate(GRP = map(id, \(xy) seq_along(lst)[map_lgl(lst, ~ xy %in% .x)]))
#>   user_id phone_number id GRP
#> 1     101      4030201  1   1
#> 2     102      4030201  2   1
#> 3     102      4030202  3   1
#> 4     103      4030202  4   1
#> 5     103      4030203  5   1
#> 6     106      4030204  6   2
#> 7     107      4030205  7   3
#> 8     111      4030203  8   1
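Note that `map()` returns a list-column here, even though it prints like a plain column. If an atomic integer vector is preferred, as in the question's `grp`, one possible variant is sketched below. It assumes an id should get the index of the first list element that contains it; `group_of` is a hypothetical helper name, not part of purrr.

```r
library(purrr)

lst <- list(c(1, 2, 3, 4, 5, 8), 6, 7)
id  <- 1:8

# Hypothetical helper: index of the first list element that contains x,
# or NA when x appears in none of them.
group_of <- function(x, groups) {
  hit <- which(map_lgl(groups, ~ x %in% .x))
  if (length(hit) == 0) NA_integer_ else hit[[1]]
}

# map_int() guarantees an integer vector rather than a list.
grp <- map_int(id, group_of, groups = lst)
grp
```

Returning `NA` for unmatched ids is a design choice made for this sketch; without it, `map_int()` would stop with an error on an id that belongs to no group.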

Case-II when the list is named

df <- structure(list(
  user_id = c(101, 102, 102, 103, 103, 106, 107, 111),
  phone_number = c(4030201, 4030201, 4030202, 4030202, 4030203,
                   4030204, 4030205, 4030203),
  id = 1:8
), class = "data.frame", row.names = c(NA, -8L))
lst <- list(a = c(1, 2, 3, 4, 5, 8), b = 6, c = 7)

library(tidyverse)

df %>% mutate(GRP = map(id, \(xy) names(lst)[map_lgl(lst, ~ xy %in% .x)]))
#>   user_id phone_number id GRP
#> 1     101      4030201  1   a
#> 2     102      4030201  2   a
#> 3     102      4030202  3   a
#> 4     103      4030202  4   a
#> 5     103      4030203  5   a
#> 6     106      4030204  6   b
#> 7     107      4030205  7   c
#> 8     111      4030203  8   a

Created on 2021-06-14 by the reprex package (v2.0.0)
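For comparison, the same labels can probably be obtained without iterating over the ids at all, by flattening the list into a lookup table. This is a base R sketch rather than a purrr solution, and it assumes each value appears in only one list element; the variable name `labels` is introduced here for illustration.

```r
lst <- list(a = c(1, 2, 3, 4, 5, 8), b = 6, c = 7)
id  <- 1:8

# One label per value in the flattened list: "a" six times, then "b", then "c".
labels <- rep(names(lst), lengths(lst))

# match() finds each id's position in the flattened values (NA if absent),
# and that position is used to pick the corresponding label.
grp <- labels[match(id, unlist(lst, use.names = FALSE))]
grp
```

For an unnamed list, replacing `names(lst)` with `seq_along(lst)` should give the numeric group index instead.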

Upvotes: 1

tmfmnk

Reputation: 39858

One option involving purrr could be:

df %>%
    mutate(grp = imap(lst, ~ .y * (id %in% .x)) %>% reduce(`+`))

  user_id phone_number id grp
1     101      4030201  1   1
2     102      4030201  2   1
3     102      4030202  3   1
4     103      4030202  4   1
5     103      4030203  5   1
6     106      4030204  6   2
7     107      4030205  7   3
8     111      4030203  8   1
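To see why this works, it may help to look at the intermediate vectors. The sketch below splits the one-liner into two steps on a bare `id` vector; the name `steps` is introduced here for illustration. It relies on the list being unnamed (so that `.y` is the element's position) and on each id appearing in at most one list element.

```r
library(purrr)

lst <- list(c(1, 2, 3, 4, 5, 8), 6, 7)
id  <- 1:8

# For an unnamed list, imap() passes the element as .x and its position as .y.
# Each step yields that position where id is in the element and 0 elsewhere.
steps <- imap(lst, ~ .y * (id %in% .x))
steps[[1]]  # 1 1 1 1 1 0 0 1
steps[[2]]  # 0 0 0 0 0 1 0 0
steps[[3]]  # 0 0 0 0 0 0 1 0

# Summing the vectors element-wise collapses them into one group index per id.
grp <- reduce(steps, `+`)
grp
```

With a named list, `.y` would be the element's name, so the multiplication would fail; and if an id appeared in two elements, their positions would be added together, so this approach fits only disjoint groups.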

Upvotes: 3
