Reputation: 467

Tidy way to pivot_wider by a range (given by values in two columns)?

I'm attempting to turn this

df <- structure(list(id = c(38320858L, 38408709L, 38314694L, 38285286L, 
38332258L), type = c("recreation", "business", "friends", "business", 
"recreation"), start_week = c(6, 8, 6, 6, 7), end_week = c(11, 
10, 11, 10, 11)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

# A tibble: 5 x 4
        id type       start_week end_week
     <int> <chr>           <dbl>    <dbl>
1 38320858 recreation          6       11
2 38408709 business            8       10
3 38314694 friends             6       11
4 38285286 business            6       10
5 38332258 recreation          7       11

into this:

result <- structure(list(type = c("recreation", "business", "friends", 
"recreation", "business", "friends", "recreation", "business", 
"friends", "recreation", "business", "friends", "recreation", 
"business", "friends", "recreation", "friends"), week = c(6L, 
6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 
11L), n = c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 
2L, 1L, 2L, 1L)), row.names = c(NA, -17L), class = "data.frame")

#          type week n
# 1  recreation    6 1
# 2    business    6 1
# 3     friends    6 1
# 4  recreation    7 2
# 5    business    7 1
# 6     friends    7 1
# 7  recreation    8 2
# 8    business    8 2
# 9     friends    8 1
# 10 recreation    9 2
# 11   business    9 2
# 12    friends    9 1
# 13 recreation   10 2
# 14   business   10 2
# 15    friends   10 1
# 16 recreation   11 2
# 17    friends   11 1

Note that

week is the start_week, end_week, and all the weeks in between
n is a count (and if n = 0 it can be omitted e.g. week 11 'business' is omitted, in other words a group_by( , by = c("type", "week") is fine, if that's helpful)

What I've tried

The tricky part of the problem is dealing with the range of weeks, and I can't figure out a tidy way to do that (i.e. a way without a loop).

In this attempt I handle only for single week column - it's only illustrative - it doesn't handle for the week range, e.g.

df %>% 
  select(-id, -end_week) %>% 
  mutate(n=1) %>% 
  pivot_wider(names_from = start_week, values_from = n, values_fill = list(n=0)) %>% 
  pivot_longer(`6`:`7`)

# A tibble: 9 x 3
  type       name  value
  <chr>      <chr> <dbl>
1 recreation 6         1
2 recreation 8         0
3 recreation 7         1
4 business   6         1
5 business   8         1
6 business   7         0
7 friends    6         1
8 friends    8         0
9 friends    7         0

Note that my attempt is quite useless since it's not dealing with the range of weeks at all

Upvotes: 2

Answers (3)

akrun

Reputation: 887831

We can also use rowwise

library(dplyr)
library(tidyr)
df %>% 
    rowwise %>%
    mutate(week = list(start_week:end_week)) %>% 
    unnest(c(week)) %>% 
    count(type, week) %>% 
    arrange(week)
# A tibble: 17 x 3
#   type        week     n
#   <chr>      <int> <int>
# 1 business       6     1
# 2 friends        6     1
# 3 recreation     6     1
# 4 business       7     1
# 5 friends        7     1
# 6 recreation     7     2
# 7 business       8     2
# 8 friends        8     1
# 9 recreation     8     2
#10 business       9     2
#11 friends        9     1
#12 recreation     9     2
#13 business      10     2
#14 friends       10     1
#15 recreation    10     2
#16 friends       11     1
#17 recreation    11     2

Upvotes: 0

Frank Zhang

Reputation: 1688

library(data.table)

setDT(df)

df[,.(type=type,week=seq(start_week,end_week)),by=seq_len(nrow(df))][,.(n=.N),by=.(type,week)][order(week)]
#>           type week n
#>  1: recreation    6 1
#>  2:    friends    6 1
#>  3:   business    6 1
#>  4: recreation    7 2
#>  5:    friends    7 1
#>  6:   business    7 1
#>  7: recreation    8 2
#>  8:   business    8 2
#>  9:    friends    8 1
#> 10: recreation    9 2
#> 11:   business    9 2
#> 12:    friends    9 1
#> 13: recreation   10 2
#> 14:   business   10 2
#> 15:    friends   10 1
#> 16: recreation   11 2
#> 17:    friends   11 1

^{Created on 2020-05-02 by the reprex package (v0.3.0)}

Upvotes: 4

Ronak Shah

Reputation: 389235

We could create a sequence between start_week and end_week, get them in separate rows using unnest and count the occurrence.

library(tidyverse)

df %>%
  mutate(week = map2(start_week, end_week, seq)) %>%
  unnest(week) %>%
  select(-start_week, -end_week) %>%
  count(type, week) %>%
  arrange(week)

# A tibble: 17 x 3
#   type        week     n
#   <chr>      <int> <int>
# 1 business       6     1
# 2 friends        6     1
# 3 recreation     6     1
# 4 business       7     1
# 5 friends        7     1
# 6 recreation     7     2
# 7 business       8     2
# 8 friends        8     1
# 9 recreation     8     2
#10 business       9     2
#11 friends        9     1
#12 recreation     9     2
#13 business      10     2
#14 friends       10     1
#15 recreation    10     2
#16 friends       11     1
#17 recreation    11     2

Upvotes: 3

Tidy way to pivot_wider by a range (given by values in two columns)?

What I've tried

Answers (3)

Related Questions