Reputation: 1111
I would like to create a new database from the df
database I entered below. My idea is to create a base where only have one day per line. For example, instead of inserting 4 rows for 01/07/2021, it will only be 1, this way the values of the columns of those days will be added.
df <- structure(
list(Id=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1),
date1 = c("2021-07-01","2021-07-01","2021-07-01","2021-07-01","2021-04-02",
"2021-04-02","2021-04-02","2021-04-02","2021-04-02","2021-04-02","2021-04-03",
"2021-04-03","2021-04-03","2021-04-03","2021-04-03","2021-04-08","2021-04-08",
"2021-04-07","2021-04-09","2021-04-10","2021-04-10"),
Week= c("Thursday","Thursday","Thursday","Thursday","Friday","Friday","Friday","Friday",
"Friday","Friday","Saturday","Saturday","Saturday","Saturday","Saturday","Thursday",
"Thursday","Friday","Friday","Saturday","Saturday"),
DTPE = c("Ho","Ho","Ho","Ho","","","","","","","","","","","","","","","","Ho","Ho"),
D1 = c(8,1,9, 3,5,4,7,6,3,8,2,3,4,6,7,8,4,2,6,2,3), DR01 = c(4,1,4,3,3,4,3,6,3,7,2,3,4,6,7,8,4,2,6,7,3),
DR02 = c(8,1,4,3,3,4,1,6,3,7,2,3,4,6,7,8,4,2,6,2,3), DR03 = c(7,5,4,3,3,4,1,5,3,3,2,3,4,6,7,8,4,2,6,4,3),
DR04= c(4,5,6,7,3,2,7,4,2,1,2,3,4,6,7,8,4,2,6,4,3),DR05 = c(9,5,4,3,3,2,1,5,3,7,2,3,4,7,7,8,4,2,6,4,3)),
class = "data.frame", row.names = c(NA, -21L))
> df
Id date1 Week DTPE D1 DR01 DR02 DR03 DR04 DR05
1 1 2021-07-01 Thursday Ho 8 4 8 7 4 9
2 1 2021-07-01 Thursday Ho 1 1 1 5 5 5
3 1 2021-07-01 Thursday Ho 9 4 4 4 6 4
4 1 2021-07-01 Thursday Ho 3 3 3 3 7 3
5 1 2021-04-02 Friday 5 3 3 3 3 3
6 1 2021-04-02 Friday 4 4 4 4 2 2
7 1 2021-04-02 Friday 7 3 1 1 7 1
8 1 2021-04-02 Friday 6 6 6 5 4 5
9 1 2021-04-02 Friday 3 3 3 3 2 3
10 1 2021-04-02 Friday 8 7 7 3 1 7
11 1 2021-04-03 Saturday 2 2 2 2 2 2
12 1 2021-04-03 Saturday 3 3 3 3 3 3
13 1 2021-04-03 Saturday 4 4 4 4 4 4
14 1 2021-04-03 Saturday 6 6 6 6 6 7
15 1 2021-04-03 Saturday 7 7 7 7 7 7
16 1 2021-04-08 Thursday 8 8 8 8 8 8
17 1 2021-04-08 Thursday 4 4 4 4 4 4
18 1 2021-04-07 Friday 2 2 2 2 2 2
19 1 2021-04-09 Friday 6 6 6 6 6 6
20 1 2021-04-10 Saturday Ho 2 7 2 4 4 4
21 1 2021-04-10 Saturday Ho 3 3 3 3 3 3
Upvotes: 3
Views: 66
Reputation: 78917
Base R
with aggregate
:
aggregate(cbind(D1, DR01, DR02, DR03, DR04, DR05) ~ Id+date1+Week, df, sum)
Output:
Id date1 Week D1 DR01 DR02 DR03 DR04 DR05
1 1 2021-04-02 Friday 33 26 24 19 19 21
2 1 2021-04-07 Friday 2 2 2 2 2 2
3 1 2021-04-09 Friday 6 6 6 6 6 6
4 1 2021-04-03 Saturday 22 22 22 22 22 23
5 1 2021-04-10 Saturday 5 10 5 7 7 7
6 1 2021-04-08 Thursday 12 12 12 12 12 12
7 1 2021-07-01 Thursday 21 12 16 19 22 21
Upvotes: 1
Reputation: 224
You can perform this using the following code:
library(dplyr)
df %>%
group_by(Id, date1, Week) %>%
select(D1:DR05) %>%
summarise_all(sum)
# A tibble: 7 × 9
# Groups: Id, date1 [7]
Id date1 Week D1 DR01 DR02 DR03 DR04 DR05
<dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2021-04-02 Friday 33 26 24 19 19 21
2 1 2021-04-03 Saturday 22 22 22 22 22 23
3 1 2021-04-07 Friday 2 2 2 2 2 2
4 1 2021-04-08 Thursday 12 12 12 12 12 12
5 1 2021-04-09 Friday 6 6 6 6 6 6
6 1 2021-04-10 Saturday 5 10 5 7 7 7
7 1 2021-07-01 Thursday 21 12 16 19 22 21
You might want to also convert the date1 field to a DATE object, but can do that using the lubridate verbs for e.g. ymd()
inside a mutate
Upvotes: 3
Reputation: 887028
We may do a grouping by 'Id', along with 'date1' and 'Week', then summarise
the numeric
columns to get the sum
in across
library(dplyr)
df %>% group_by(Id, date1, Week) %>%
summarise(across(where(is.numeric), sum, na.rm = TRUE), .groups = 'drop')
Upvotes: 3