Reputation: 27
I want to change the following data set :
date A B
01/01/2018 391 585
02/01/2018 420 595
03/01/2018 455 642
04/01/2018 469 654
05/01/2018 611 900
06/01/2018 449 640
07/01/2018 335 522
08/01/2018 726 955
09/01/2018 676 938
10/01/2018 508 740
11/01/2018 562 778
12/01/2018 561 761
13/01/2018 426 609
14/01/2018 334 508
My desired output is as follows:
date A B
07/01/2018 3130 4538
14/01/2018 3793 5289
In which, the numbers of Columns A and B are summed over seven days of weeks. Indeed, I want to convert daily data to weekly data. I found two solutions on the Stackoverflow website. One solution would be using library(tidyquant), and following code
library(tidyquant)
newfd<-df %>%
tq_transmute(select = A,
mutate_fun = apply.weekly,
FUN = sum)
The code produces weekly data for Column A while I need all columns. (I have many columns). I also used the following code. But, I don’t know how I can develop the code for all columns.
library(slider)
slide_period_dfr(.x = califo, .i=as.Date(califo$date),
.period = "week",
.f = ~data.frame(week_ending = tail(.x$ date,1),
week_freq = sum(.x$A)),
.origin = as.Date("2018-01-01"))
Upvotes: 1
Views: 1300
Reputation: 1388
You can compute an index id = 0:(nrow(df) - 1)
, once the series is arranged by date, and use that to define to which period (week) belongs each date week = id %/% 7
. Date linked to each week is chosen in date = max(date)
to be the last date of the week. Other options are possible.
library(dplyr)
library(lubridate)
df <- tribble(~date, ~A, ~B,
"01/01/2018", 391, 585,
"02/01/2018", 420, 595,
"03/01/2018", 455, 642,
"04/01/2018", 469, 654,
"05/01/2018", 611, 900,
"06/01/2018", 449, 640,
"07/01/2018", 335, 522,
"08/01/2018", 726, 955,
"09/01/2018", 676, 938,
"10/01/2018", 508, 740,
"11/01/2018", 562, 778,
"12/01/2018", 561, 761,
"13/01/2018", 426, 609,
"14/01/2018", 334, 508)
df %>%
mutate(date = dmy(date)) %>%
arrange(date) %>%
mutate(id = 0:(nrow(df) - 1), week = id %/% 7) %>%
group_by(week) %>%
summarize(date = max(date), across(A:B, sum))
#> # A tibble: 2 x 4
#> week date A B
#> <dbl> <date> <dbl> <dbl>
#> 1 0 2018-01-07 3130 4538
#> 2 1 2018-01-14 3793 5289
Created on 2021-06-05 by the reprex package (v0.3.0)
Upvotes: 1
Reputation: 388817
You can use ceiling_date
to make the dates to weekly dates and sum
multiple variables with across
in dplyr
.
library(dplyr)
library(lubridate)
df %>%
group_by(date = ceiling_date(dmy(date), 'week', week_start = 1)) %>%
summarise(across(A:B, sum))
# date A B
# <date> <int> <int>
#1 2018-01-08 3130 4538
#2 2018-01-15 3793 5289
data
df <- structure(list(date = c("01/01/2018", "02/01/2018", "03/01/2018",
"04/01/2018", "05/01/2018", "06/01/2018", "07/01/2018", "08/01/2018",
"09/01/2018", "10/01/2018", "11/01/2018", "12/01/2018", "13/01/2018",
"14/01/2018"), A = c(391L, 420L, 455L, 469L, 611L, 449L, 335L,
726L, 676L, 508L, 562L, 561L, 426L, 334L), B = c(585L, 595L,
642L, 654L, 900L, 640L, 522L, 955L, 938L, 740L, 778L, 761L, 609L,
508L)), class = "data.frame", row.names = c(NA, -14L))
Upvotes: 2
Reputation: 612
You could pivot_longer()
so that you have only one column of data to transform, apply the function to that column, and then pivot_wider()
.
Here is a trivial example with mtcars
:
library(tidyverse)
mtcars %>%
rownames_to_column(var = "car") %>%
select(car, mpg, cyl) %>%
pivot_longer(cols = c(mpg, cyl), names_to = "var") %>%
mutate(value = value^2) %>%
pivot_wider(names_from = var, names_prefix = "squared_")
# A tibble: 32 x 3
car squared_mpg squared_cyl
<chr> <dbl> <dbl>
1 Mazda RX4 441 36
2 Mazda RX4 Wag 441 36
3 Datsun 710 520. 16
4 Hornet 4 Drive 458. 36
5 Hornet Sportabout 350. 64
6 Valiant 328. 36
7 Duster 360 204. 64
8 Merc 240D 595. 16
9 Merc 230 520. 16
10 Merc 280 369. 36
# … with 22 more rows
Your aggregation
would replace my mutate
step.
Whether this is more concise than repeatedly creating new variables depends on how many variables you are dealing with.
Upvotes: 0