RL_Pug
RL_Pug

Reputation: 867

How to make a frequency table out of dates but have every week as a part of the table?

So I have a data frame that looks like this.

library(lubridate)

data <- data.frame(date = c("7/29/2021 21:18", "7/6/2021 20:37", "7/23/2021 20:11", "7/23/2021 21:35", 
                "7/23/2021 21:42", "7/23/2021 21:47", "10/11/2021 14:45", "10/11/2021 15:57", 
                "10/11/2021 16:23", "7/29/2021 15:52", "9/7/2021 14:36", "8/31/2021 21:05", 
                "8/25/2021 18:22", "6/28/2021 14:52", "10/8/2021 19:34", "7/19/2021 15:38", 
                "10/27/2021 21:09", "10/8/2021 15:56", "9/21/2021 17:20", "8/25/2021 20:42", 
                "8/30/2021 20:28", "5/20/2021 14:58", "11/1/2021 21:09", "11/1/2021 20:15", 
                "8/5/2021 16:17", "6/30/2021 20:12", "5/14/2021 17:36", "5/19/2021 20:40", 
                "5/20/2021 21:06", "5/19/2021 20:59"))

data <- data %>% 
  mutate(Num = week(as.Date(data$date, format = "%m/%d/%Y"))) %>% 
  arrange(Num) %>% 
  group_by(Num) %>% 
  summarise(Count = n())


print(data)

# A tibble: 13 x 2
     Num Count
   <dbl> <int>
 1    20     5
 2    26     2
 3    27     1
 4    29     1
 5    30     6
 6    31     1
 7    34     2
 8    35     2
 9    36     1
10    38     1
11    41     5
12    43     1
13    44     2

My problem with this is that the weeks are in number format and there are some weeks missing for example 42. I would like to see missing weeks even if there is no count. Is there a better way to do this?

My desired output would be something like where I can see the first monday of the week

Num  Date       Count
1   1/4/2021     0
2   1/11/2021    1
3   1/18/2021    3

Best,

Upvotes: 0

Views: 208

Answers (1)

Oliver
Oliver

Reputation: 1280

To get the first day of every week, you can use floor_date() from lubridate. The week_start argument lets you start your weeks on Mondays, because it starts them on Sundays otherwise.

To fill in the 0s where you don't have any data, you can use tidyr's complete().

library(lubridate)
library(dplyr)
library(tidyr)

data <- tibble(date_chr = c(
  "7/29/2021 21:18",
  "7/6/2021 20:37",
  "7/23/2021 20:11",
  "7/23/2021 21:35",
  "7/23/2021 21:42",
  "7/23/2021 21:47",
  "10/11/2021 14:45",
  "10/11/2021 15:57",
  "10/11/2021 16:23",
  "7/29/2021 15:52",
  "9/7/2021 14:36",
  "8/31/2021 21:05",
  "8/25/2021 18:22",
  "6/28/2021 14:52"))


first_weekdays <- data %>% 
  mutate(date = as_date(mdy_hm(date_chr)),
         first_weekday = floor_date(date, "week", week_start = 1)) %>% 
  arrange(date) %>% 
  print()
#> # A tibble: 14 x 3
#>    date_chr         date       first_weekday
#>    <chr>            <date>     <date>       
#>  1 6/28/2021 14:52  2021-06-28 2021-06-28   
#>  2 7/6/2021 20:37   2021-07-06 2021-07-05   
#>  3 7/23/2021 20:11  2021-07-23 2021-07-19   
#>  4 7/23/2021 21:35  2021-07-23 2021-07-19   
#>  5 7/23/2021 21:42  2021-07-23 2021-07-19   
#>  6 7/23/2021 21:47  2021-07-23 2021-07-19   
#>  7 7/29/2021 21:18  2021-07-29 2021-07-26   
#>  8 7/29/2021 15:52  2021-07-29 2021-07-26   
#>  9 8/25/2021 18:22  2021-08-25 2021-08-23   
#> 10 8/31/2021 21:05  2021-08-31 2021-08-30   
#> 11 9/7/2021 14:36   2021-09-07 2021-09-06   
#> 12 10/11/2021 14:45 2021-10-11 2021-10-11   
#> 13 10/11/2021 15:57 2021-10-11 2021-10-11   
#> 14 10/11/2021 16:23 2021-10-11 2021-10-11

count_by_week <- first_weekdays %>% 
  count(first_weekday) %>% 
  complete(first_weekday = seq.Date(min(first_weekday),
                                    max(first_weekday),
                                    "7 days"),
           fill = list(n = 0)) %>% 
  print()
#> # A tibble: 16 x 2
#>    first_weekday     n
#>    <date>        <dbl>
#>  1 2021-06-28        1
#>  2 2021-07-05        1
#>  3 2021-07-12        0
#>  4 2021-07-19        4
#>  5 2021-07-26        2
#>  6 2021-08-02        0
#>  7 2021-08-09        0
#>  8 2021-08-16        0
#>  9 2021-08-23        1
#> 10 2021-08-30        1
#> 11 2021-09-06        1
#> 12 2021-09-13        0
#> 13 2021-09-20        0
#> 14 2021-09-27        0
#> 15 2021-10-04        0
#> 16 2021-10-11        3

Created on 2021-11-09 by the reprex package (v2.0.1)

Upvotes: 3

Related Questions