Reputation: 564
week = seq(as.Date("2022-9-1"), as.Date("2022-10-1"), by="weeks")
visits = c("[1,2,6,4,6,11,19]", "[12,2,7,8,6,11,22]",
"[8,2,5,8,4,15,13]", "[9,2,7,4,6,11,7]",
"[25,34]")
data = data.frame(week, visits)
The above data frame records daily visit in a string column, and each row represents a week.
I would like to convert the above dataset to a long table with daily visits. This is the expected output:
date | visits |
---|---|
2022-9-1 | 1 |
2022-9-2 | 2 |
2022-9-3 | 6 |
2022-9-3 | 6 |
.... | |
2022-9-30 | 34 |
I have tried the following steps: (1) break down the string using strsplit
and remove the first and last item (which is [
and ]
); (2) store each day's visit in separate columns; (3) use dplyr::mutate
and dplyr::complete
to fill the dates between weeks; (4) then loop over each first day of the week to make each week's wide data to long data. All these add to 60ish lines of code, where I feel that there might be an easier solution.
Thank you so much in advance!
Upvotes: 0
Views: 111
Reputation: 886968
Perhaps this helps
library(dplyr)
library(tidyr)
library(stringr)
data %>%
mutate(visits = str_remove_all(visits, "\\]|\\[")) %>%
separate_rows(visits) %>%
group_by(week) %>%
mutate(date = seq(first(week), length.out = n(),
by = '1 day'), .before = 'week') %>%
ungroup %>%
select(-week)
-output
# A tibble: 30 × 2
date visits
<date> <chr>
1 2022-09-01 1
2 2022-09-02 2
3 2022-09-03 6
4 2022-09-04 4
5 2022-09-05 6
6 2022-09-06 11
7 2022-09-07 19
8 2022-09-08 12
9 2022-09-09 2
10 2022-09-10 7
# … with 20 more rows
Upvotes: 2