Pierre

Reputation: 741

R - group_by & summarise as list - display data.frame

Edit: Sorry, I just realised I left out some information: there are multiple value columns.

I'm trying to group_by a data.frame and summarise the values as lists. However, I also want to add additional information to the list.

Here's what I have tried and what I would like to achieve.

library(dplyr)

df <- 
  data.frame(date = c("2020-01-01", "2020-01-01", "2020-01-01",
                      "2019-01-01", "2019-01-01", "2019-01-01"),
             company = c("A", "B", "C",
                         "A", "B", "C"),
             product = c("P1", "P1", "P2",
                         "P1", "P2", "P2"),
             value_1 = c(1, 1, 2,
                         2, 3, 4),
             value_2 = c(5, 6, 8,
                         9, 1, 3),
             stringsAsFactors = FALSE) %>% 
  as_tibble()

# Collapse every numeric column into a list-column, one element per group
df_sum <- 
  df %>% 
  group_by(date, product) %>% 
  summarise_if(.predicate = is.numeric, .funs = function(x) list(x))

Instead of just showing the values in df_sum, I would like to include the respective company. That is, df_sum$value_1[[2]] should return something like

  company value
1       B     3
2       C     4

instead of

[1] 3 4

and df_sum$value_2[[2]] should return

  company value_2
1       B       1
2       C       3

Just like Cettt's answer, but with the ability to show multiple columns:

# A tibble: 4 x 4
  date       product value_1          value_2       
  <chr>      <chr>   <list>           <list> 
1 2020-01-01 P1      <tibble [2 x 2]> <tibble [2 x 2]>
2 2020-01-01 P2      <tibble [1 x 2]> <tibble [1 x 2]>
3 2019-01-01 P1      <tibble [1 x 2]> <tibble [1 x 2]>
4 2019-01-01 P2      <tibble [2 x 2]> <tibble [2 x 2]>

Upvotes: 0

Views: 59

Answers (1)

Cettt

Reputation: 11981

You can use nest from the tidyr package for this. Assuming a single value column named value (as in the original version of the question):

df %>%
   tidyr::nest(value = c(company, value))
# A tibble: 4 x 3
  date       product value           
  <chr>      <chr>   <list>          
1 2020-01-01 P1      <tibble [2 x 2]>
2 2020-01-01 P2      <tibble [1 x 2]>
3 2019-01-01 P1      <tibble [1 x 2]>
4 2019-01-01 P2      <tibble [2 x 2]>

This returns a nested dataframe (or nested tibble to be more precise).
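To make the nested structure concrete, here is a minimal sketch of how the inner tibbles can be read back out. The name `df1` and its single `value` column are assumptions for illustration (a one-value-column version of the question's data), not something taken from the question itself.

```r
library(dplyr)
library(tidyr)

# Assumed single-value-column version of the question's data
df1 <- tibble(
  date    = c("2020-01-01", "2020-01-01", "2020-01-01",
              "2019-01-01", "2019-01-01", "2019-01-01"),
  company = c("A", "B", "C", "A", "B", "C"),
  product = c("P1", "P1", "P2", "P1", "P2", "P2"),
  value   = c(1, 1, 2, 2, 3, 4)
)

# One row per (date, product); company and value move into a list-column
nested <- df1 %>% nest(value = c(company, value))

# Each element of the list-column is itself a tibble;
# the fourth one holds the 2019-01-01 / P2 rows (companies B and C)
nested$value[[4]]

# unnest() expands the list-column back into ordinary rows
flat <- nested %>% unnest(value)
```

Note that nest() keeps the groups in order of first appearance, so the row positions can differ from those of a group_by() + summarise() result, which is sorted by the grouping columns.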

EDIT

Here is a solution for multiple value columns:

df_sum <- df %>%
   tidyr::nest(value_1 = c(company, value_1), 
               value_2 = c(company, value_2))
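If you would rather stay with the group_by() / summarise() workflow from the question, the following is one possible sketch, not the nest() approach itself. It assumes dplyr 1.0 or later (for across() and the .groups argument) and that all value columns share the prefix "value_". The inner column name `value` is a choice made here for illustration.

```r
library(dplyr)

df <- tibble(
  date    = c("2020-01-01", "2020-01-01", "2020-01-01",
              "2019-01-01", "2019-01-01", "2019-01-01"),
  company = c("A", "B", "C", "A", "B", "C"),
  product = c("P1", "P1", "P2", "P1", "P2", "P2"),
  value_1 = c(1, 1, 2, 2, 3, 4),
  value_2 = c(5, 6, 8, 9, 1, 3)
)

# For every value_* column, wrap company + that column's values
# into a small tibble and store it as one list-column element per group
df_sum <- df %>%
  group_by(date, product) %>%
  summarise(
    across(starts_with("value_"),
           ~ list(tibble(company = company, value = .x))),
    .groups = "drop"
  )

# Inner tibble for the second group (2019-01-01 / P2)
df_sum$value_1[[2]]
df_sum$value_2[[2]]
```

Because summarise() sorts by the grouping columns, the second row here is 2019-01-01 / P2, which matches the indexing used in the question.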

Upvotes: 1
