Reputation: 467
I'm attempting to turn this
df <- structure(list(id = c(38320858L, 38408709L, 38314694L, 38285286L,
38332258L), type = c("recreation", "business", "friends", "business",
"recreation"), start_week = c(6, 8, 6, 6, 7), end_week = c(11,
10, 11, 10, 11)), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
# A tibble: 5 x 4
id type start_week end_week
<int> <chr> <dbl> <dbl>
1 38320858 recreation 6 11
2 38408709 business 8 10
3 38314694 friends 6 11
4 38285286 business 6 10
5 38332258 recreation 7 11
into this:
result <- structure(list(type = c("recreation", "business", "friends",
"recreation", "business", "friends", "recreation", "business",
"friends", "recreation", "business", "friends", "recreation",
"business", "friends", "recreation", "friends"), week = c(6L,
6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L,
11L), n = c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
2L, 1L, 2L, 1L)), row.names = c(NA, -17L), class = "data.frame")
# type week n
# 1 recreation 6 1
# 2 business 6 1
# 3 friends 6 1
# 4 recreation 7 2
# 5 business 7 1
# 6 friends 7 1
# 7 recreation 8 2
# 8 business 8 2
# 9 friends 8 1
# 10 recreation 9 2
# 11 business 9 2
# 12 friends 9 1
# 13 recreation 10 2
# 14 business 10 2
# 15 friends 10 1
# 16 recreation 11 2
# 17 friends 11 1
Note that
group_by( , by = c("type", "week")
is fine, if that's helpful)The tricky part of the problem is dealing with the range of weeks, and I can't figure out a tidy way to do that (i.e. a way without a loop).
In this attempt I handle only for single week column - it's only illustrative - it doesn't handle for the week range, e.g.
df %>%
select(-id, -end_week) %>%
mutate(n=1) %>%
pivot_wider(names_from = start_week, values_from = n, values_fill = list(n=0)) %>%
pivot_longer(`6`:`7`)
# A tibble: 9 x 3
type name value
<chr> <chr> <dbl>
1 recreation 6 1
2 recreation 8 0
3 recreation 7 1
4 business 6 1
5 business 8 1
6 business 7 0
7 friends 6 1
8 friends 8 0
9 friends 7 0
Note that my attempt is quite useless since it's not dealing with the range of weeks at all
Upvotes: 2
Views: 222
Reputation: 887831
We can also use rowwise
library(dplyr)
library(tidyr)
df %>%
rowwise %>%
mutate(week = list(start_week:end_week)) %>%
unnest(c(week)) %>%
count(type, week) %>%
arrange(week)
# A tibble: 17 x 3
# type week n
# <chr> <int> <int>
# 1 business 6 1
# 2 friends 6 1
# 3 recreation 6 1
# 4 business 7 1
# 5 friends 7 1
# 6 recreation 7 2
# 7 business 8 2
# 8 friends 8 1
# 9 recreation 8 2
#10 business 9 2
#11 friends 9 1
#12 recreation 9 2
#13 business 10 2
#14 friends 10 1
#15 recreation 10 2
#16 friends 11 1
#17 recreation 11 2
Upvotes: 0
Reputation: 1688
library(data.table)
setDT(df)
df[,.(type=type,week=seq(start_week,end_week)),by=seq_len(nrow(df))][,.(n=.N),by=.(type,week)][order(week)]
#> type week n
#> 1: recreation 6 1
#> 2: friends 6 1
#> 3: business 6 1
#> 4: recreation 7 2
#> 5: friends 7 1
#> 6: business 7 1
#> 7: recreation 8 2
#> 8: business 8 2
#> 9: friends 8 1
#> 10: recreation 9 2
#> 11: business 9 2
#> 12: friends 9 1
#> 13: recreation 10 2
#> 14: business 10 2
#> 15: friends 10 1
#> 16: recreation 11 2
#> 17: friends 11 1
Created on 2020-05-02 by the reprex package (v0.3.0)
Upvotes: 4
Reputation: 389235
We could create a sequence between start_week
and end_week
, get them in separate rows using unnest
and count
the occurrence.
library(tidyverse)
df %>%
mutate(week = map2(start_week, end_week, seq)) %>%
unnest(week) %>%
select(-start_week, -end_week) %>%
count(type, week) %>%
arrange(week)
# A tibble: 17 x 3
# type week n
# <chr> <int> <int>
# 1 business 6 1
# 2 friends 6 1
# 3 recreation 6 1
# 4 business 7 1
# 5 friends 7 1
# 6 recreation 7 2
# 7 business 8 2
# 8 friends 8 1
# 9 recreation 8 2
#10 business 9 2
#11 friends 9 1
#12 recreation 9 2
#13 business 10 2
#14 friends 10 1
#15 recreation 10 2
#16 friends 11 1
#17 recreation 11 2
Upvotes: 3