Reputation: 867
So I have a data frame that looks like this.
library(lubridate)
data <- data.frame(date = c("7/29/2021 21:18", "7/6/2021 20:37", "7/23/2021 20:11", "7/23/2021 21:35",
"7/23/2021 21:42", "7/23/2021 21:47", "10/11/2021 14:45", "10/11/2021 15:57",
"10/11/2021 16:23", "7/29/2021 15:52", "9/7/2021 14:36", "8/31/2021 21:05",
"8/25/2021 18:22", "6/28/2021 14:52", "10/8/2021 19:34", "7/19/2021 15:38",
"10/27/2021 21:09", "10/8/2021 15:56", "9/21/2021 17:20", "8/25/2021 20:42",
"8/30/2021 20:28", "5/20/2021 14:58", "11/1/2021 21:09", "11/1/2021 20:15",
"8/5/2021 16:17", "6/30/2021 20:12", "5/14/2021 17:36", "5/19/2021 20:40",
"5/20/2021 21:06", "5/19/2021 20:59"))
data <- data %>%
mutate(Num = week(as.Date(data$date, format = "%m/%d/%Y"))) %>%
arrange(Num) %>%
group_by(Num) %>%
summarise(Count = n())
print(data)
# A tibble: 13 x 2
Num Count
<dbl> <int>
1 20 5
2 26 2
3 27 1
4 29 1
5 30 6
6 31 1
7 34 2
8 35 2
9 36 1
10 38 1
11 41 5
12 43 1
13 44 2
My problem with this is that the weeks are in number format and there are some weeks missing for example 42. I would like to see missing weeks even if there is no count. Is there a better way to do this?
My desired output would be something like where I can see the first monday of the week
Num Date Count
1 1/4/2021 0
2 1/11/2021 1
3 1/18/2021 3
Best,
Upvotes: 0
Views: 208
Reputation: 1280
To get the first day of every week, you can use floor_date()
from lubridate. The week_start
argument lets you start your weeks on Mondays, because it starts them on Sundays otherwise.
To fill in the 0s where you don't have any data, you can use tidyr's complete()
.
library(lubridate)
library(dplyr)
library(tidyr)
data <- tibble(date_chr = c(
"7/29/2021 21:18",
"7/6/2021 20:37",
"7/23/2021 20:11",
"7/23/2021 21:35",
"7/23/2021 21:42",
"7/23/2021 21:47",
"10/11/2021 14:45",
"10/11/2021 15:57",
"10/11/2021 16:23",
"7/29/2021 15:52",
"9/7/2021 14:36",
"8/31/2021 21:05",
"8/25/2021 18:22",
"6/28/2021 14:52"))
first_weekdays <- data %>%
mutate(date = as_date(mdy_hm(date_chr)),
first_weekday = floor_date(date, "week", week_start = 1)) %>%
arrange(date) %>%
print()
#> # A tibble: 14 x 3
#> date_chr date first_weekday
#> <chr> <date> <date>
#> 1 6/28/2021 14:52 2021-06-28 2021-06-28
#> 2 7/6/2021 20:37 2021-07-06 2021-07-05
#> 3 7/23/2021 20:11 2021-07-23 2021-07-19
#> 4 7/23/2021 21:35 2021-07-23 2021-07-19
#> 5 7/23/2021 21:42 2021-07-23 2021-07-19
#> 6 7/23/2021 21:47 2021-07-23 2021-07-19
#> 7 7/29/2021 21:18 2021-07-29 2021-07-26
#> 8 7/29/2021 15:52 2021-07-29 2021-07-26
#> 9 8/25/2021 18:22 2021-08-25 2021-08-23
#> 10 8/31/2021 21:05 2021-08-31 2021-08-30
#> 11 9/7/2021 14:36 2021-09-07 2021-09-06
#> 12 10/11/2021 14:45 2021-10-11 2021-10-11
#> 13 10/11/2021 15:57 2021-10-11 2021-10-11
#> 14 10/11/2021 16:23 2021-10-11 2021-10-11
count_by_week <- first_weekdays %>%
count(first_weekday) %>%
complete(first_weekday = seq.Date(min(first_weekday),
max(first_weekday),
"7 days"),
fill = list(n = 0)) %>%
print()
#> # A tibble: 16 x 2
#> first_weekday n
#> <date> <dbl>
#> 1 2021-06-28 1
#> 2 2021-07-05 1
#> 3 2021-07-12 0
#> 4 2021-07-19 4
#> 5 2021-07-26 2
#> 6 2021-08-02 0
#> 7 2021-08-09 0
#> 8 2021-08-16 0
#> 9 2021-08-23 1
#> 10 2021-08-30 1
#> 11 2021-09-06 1
#> 12 2021-09-13 0
#> 13 2021-09-20 0
#> 14 2021-09-27 0
#> 15 2021-10-04 0
#> 16 2021-10-11 3
Created on 2021-11-09 by the reprex package (v2.0.1)
Upvotes: 3